Journal of Applied Psychology 


John G. Darley, Editor
University of Minnesota


Lorraine Bouthilet, Managing Editor





Table of Contents 


Item Weights in Employee Rating Scales: C. E. Jurgensen
Use of a Weighted Application Blank in Hiring Seasonal Employees: M. D. Dunnette and J. Maetzold
Time in Training as a Criterion of Success in Radio Code: L. V. Gordon
Expected Locations of Digits and Letters on Ten-Button Keysets: M. C. Lutz and A. Chapanis
Comparison of Group Paper-and-Pencil Tests with Certain Psychophysical Tests for Measuring Driving Aptitude of Army Personnel: A. R. Lauer
The Prediction of Individual Differences in Susceptibility to Industrial Monotony: P. C. Smith
Positive Aspects of Motivation in Repetitive Work: Effects of Lot Size upon Spacing of Voluntary Work Stoppages: P. C. Smith and C. Lem
Group Decision and Employee Participation: L. C. Lawrence and P. C. Smith
Evaluating Potential Officer Effectiveness in a Training Situation: B. J. Suttell
The Development of a Job Sample Trouble-Shooting Performance Examination: A. I. Siegel and J. Jensen
Some Aspects of the Executive Personality: J. B. Miner and J. E. Culver
Group Performance in a Cognitive Task: A. L. Comrey and C. K. Staats
The Effect of Randomizing the Cleeton Vocational Interest Inventory Items: R. T. Kelleher
Personality Adjustment and the Study of Abnormal Psychology: E. S. Mills
Personality Traits and Persistence of Interest in Teaching as a Vocational Choice: A. C. LaBue
The Barron-Welsh Art Scale as a Predictor of Originality and Level of Ability Among Artists: J. C. Rosen
A Further Analysis of the Effectiveness of Psychological Warfare: L. A. Kahn and T. G. Andrews
A Model for the Analysis of Consumer Preference and an Exploratory Test: P. H. Benson
Test Validity over a Seventeen-Year Period: E. B. Knauft

Book Reviews 

New Books, Monographs, and Pamphlets 





American Psychological Association 


Volume 39, Number 5 October, 1955 





Consulting Editors 


Harold E. Burtt, Ohio State University 
Alphonse Chapanis, Johns Hopkins Univer- 
sity 


Clifford E. Jurgensen, Minneapolis Gas 
Company 

Laurence S. McGaughran, University of 
Houston 


Quinn McNemar, Stanford University 

Alexander Mintz, City College of New York 

Harold F. Rothe, Fairbanks, Morse and 
Company 

Julian B. Rotter, Ohio State University 

Donald E. Super, Columbia University 

Miles A. Tinker, University of Minnesota 





This journal gives primary consideration to origi- 
nal investigations in any field of applied psychol- 
ogy except clinical and consulting psychology, al- 
though a descriptive or theoretical article may be 
accepted if it represents a special contribution in 
an applied field. Quantitative investigations of in- 
terest or value to psychologists working in the fol- 
lowing broad fields will be considered: vocational 
and educational prognosis, diagnosis, and guidance 
at the secondary and college level; personnel re- 
search in business, industry, and government; bio- 
mechanics; industrial working conditions; research 
on opinion and morale factors; job analysis and 
classification research; market and advertising re- 
search. 


Because of the large number of manuscripts sub- 
mitted, authors should adhere to the rule of 


“brevity consistent with clarity.” The typical 
manuscript should run to approximately 4,000 
words. There is a lag of approximately twelve 
months between receipt and publication of an 
article. Authors may request advanced publica- 
tion if they are prepared to pay the cost of print- 
ing the necessary extra pages. 

Manuscripts should be addressed to the Editor, 
John G. Darley, 408 Johnston Hall, University of 
Minnesota, Minneapolis 14, Minnesota. All manu- 
scripts should be submitted in duplicate. Original 
figures are prepared for publication; duplicate figures
may be photographic or pencil-drawn copies.

Manuscripts must conform to the style require- 
ments described in the “Publication Manual of the 
American Psychological Association,” Psychol. Bull., 
1952, 49, No. 4, Part 2. 





Journal of Applied Psychology 


Published bimonthly by the 
American Psychological Association 
Prince and Lemon Sts., Lancaster, Pa. 
and 1333 Sixteenth Street N.W. 
Washington 6, D. C. 


$7.00 per volume 


$1.50 per issue 


Subscriptions, orders, and business communications should be addressed to the American Psychological Association,
1333 Sixteenth St. N.W., Washington 6, D. C. Address changes must reach the subscription office by the 25th of
the month to take effect the following month. Undelivered copies resulting from address changes will not be replaced;
subscribers should notify the post office that they will guarantee second-class forwarding postage. Other claims for
undelivered copies must be made within four months of publication.

Entered as second-class matter, August 19, 1943, at the post office at Lancaster, Pa., under the act of March 3, 1879.
Acceptance for mailing at the special rate of postage provided for in paragraph (d-2), Section 34.40, P. L. & R.
of 1948, authorized October 10, 1947.


Copyright, 1955, by the American Psychological Association, Inc. 





Journal of Applied Psychology 








Vol. 39, No. 5


October, 1955








Item Weights in Employee Rating Scales 


C. E. Jurgensen 


Minneapolis Gas Company 


Statisticians have repeatedly pointed out 
that parts combined into a composite are 
weighted in proportion to their standard 
deviations. This is true whether the parts 
consist of test scores, test items, or other 
variables. Applied to rating-scale procedure, 
this means that items in a rating scale are 
not necessarily weighted equally even though 
identical weights are assigned each item. The 
“real” weights are equal only if the stand- 
ard deviations are equal. Also, differential 
weights do not weight items in proportion to 
the assigned weights unless the standard 
deviations are equal. 
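
The weighting principle stated above can be illustrated with a small sketch. It takes each part's share of the composite as its nominal weight multiplied by its standard deviation, which is the rule of thumb the paragraph describes; the function name and the standard deviations used are hypothetical and chosen only for illustration.

```python
def effective_weights(nominal_weights, standard_deviations):
    """Share of the composite carried by each part, taken as nominal weight x SD."""
    raw = [w * sd for w, sd in zip(nominal_weights, standard_deviations)]
    total = sum(raw)
    return [r / total for r in raw]

# Two traits with identical nominal weights but unequal spread (hypothetical SDs).
shares = effective_weights([1, 1], [20.0, 5.0])
print(shares)  # [0.8, 0.2]
```

With equal nominal weights, the trait whose ratings spread four times as widely carries four times the effective weight, so identical assigned weights do not by themselves guarantee equal weighting.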

Industry has flagrantly and typically vio- 
lated these weighting principles. This has 
been particularly true in the field of merit 
rating and job analysis. It is the rule rather 
than the exception to see an employee-rating 
scale in which adjectives connoting degrees 
of merit (e.g., excellent, good, fair, poor) are 
assigned weights such as 4-3-2-1, or 100-75-50-25.
An equally common error is that of
assigning a higher maximum number of points 
to one trait than to another (e.g., assigning a 
maximum of 200 points to quality of work as 
compared with 50 points to health) with the 
expectation that the first trait is weighted 
four times as heavily as the second. The 
failure of typical industrial practice to agree 
with statistical theory has long been a matter 
of concern to numerous industrial psycholo- 
gists, including Tiffin and Musser (2). 

A study by Jurgensen (1) dealt with an 
employee-rating scale consisting of four over- 
all ratings, one being a four-point scale, one 
five-, another six-, and another a seven-point 
scale. Ratings were obtained on 810 em- 
ployees and weights were determined by con- 


verting ratings to normalized standard scores 
using mid-point percentiles. The method 
proved highly successful in actual use, and 
three additional scales were developed ac- 
cording to the same principles. A simple 
preliminary report was devised for use 30 
days after employment. It consisted of four
over-all ratings using scales having from two 
to five rating categories or points. A pro- 
bationary report was devised for use 90 days 
after employment consisting of six over-all 
ratings ranging from two- to seven-point 
scales. The third form was for use with 
permanent employees and consisted of six 
over-all ratings containing from three- to 
nine-point scales. In accordance with sta- 
tistical theory and approved practice, all rat- 
ings were weighted according to their vari- 
ability, with weights being assigned on the 
basis of normalized scores. 
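
A minimal sketch of the conversion described above, assuming the usual reading of "mid-point percentiles": each rating category is placed at the proportion of ratings falling below it plus half the proportion falling within it, and that proportion is converted to a normal deviate. The category counts are hypothetical, and any later rescaling of the deviates into whole-number weights is not described in the text and is left out.

```python
from statistics import NormalDist

def midpoint_normalized_scores(category_counts):
    """Map ordered categories (least to most favorable) to normal deviates.

    Assumes every category was used at least once; an empty first or last
    category would give a percentile of 0 or 1, which has no finite deviate.
    """
    total = sum(category_counts)
    scores = []
    below = 0
    for count in category_counts:
        midpoint = (below + 0.5 * count) / total  # mid-point percentile
        scores.append(NormalDist().inv_cdf(midpoint))
        below += count
    return scores

# Hypothetical counts for a five-point scale rated on 810 employees.
counts = [40, 160, 400, 170, 40]
z = midpoint_normalized_scores(counts)
print([round(v, 2) for v in z])
```

Categories that hold few ratings at the extremes receive deviates far from zero, which is how this method lets the observed distribution, rather than the printed order of the steps, set the spacing of the weights.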

The study reported here arose from a sim- 
ple question. How much better are statisti- 
cally determined weights than those which 
are arbitrarily assigned in serial order? 
There was no question that the statistically 
determined weights would be superior to the 
arbitrary weights. The question was how 
much better. The two weighting methods 
were expected to have a relatively low correlation
and the superiority of the statistically


determined weights was expected to be appar- 
ent in many respects. No such data seem to 
be available in the literature, but would be 
valuable in convincing personnel people to 
follow a statistically acceptable rather than 
a “common-sense” method. 

The six over-all ratings comprising one of 
the forms (Permanent Employee report) are 
given in Table 1, together with the statistically
determined weights (Column A) and the
arbitrary serial weights (Column B). These 
items will illustrate the nature of each of the 
three scales. 

In one respect the forms used in this study 
are not particularly good for the purpose of 
comparing statistically determined versus ar- 
bitrarily assigned weights. Rating categories 
within each of the over-all ratings were se- 
lected with the hope that they would have 
equal-appearing intervals. This is a typical 
hope in rating-scale construction, but the de- 
gree of fulfillment is rarely known. To the 
extent that the desire is fulfilled, however, it 
might be expected that statistical and arbi- 
trary weights would be correlated. 

In another respect, however, these reports 
illustrate the desired principles very ade- 
quately. Arbitrary weights were assigned by 
designating the least favorable category as 1, 
and assigning numbers serially to remaining 
steps in the scale. As is seen in Table 1, the 
highest weight in the last item is 3, and is assigned
to the description “Definitely outstanding.”
However, a weight of 3 on the preceding
item indicates a “Poor” employee.


Results 


The statistical versus arbitrary weights 
were correlated for each of the three rating 
forms: Preliminary, Probationary, and Per- 
manent Employee. The correlations were: 
.995, .996, and .994 for n’s of 245, 245, and 
210, respectively. Obviously, these correla- 
tions show a high degree of equivalence be- 
tween statistically determined and arbitrarily 
assigned weights. There remains the possi- 
bility, however, that reliability of composite 
scores based on statistical weights is appre- 
ciably higher than reliability based on arbi- 
trary weights. 
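
Why correlations of this size are plausible can be seen in a small sketch for a single five-point item. The category counts are hypothetical (a 10-20-40-20-10 split), the normalized weights are computed from mid-point percentiles as an assumed reading of the method, and the serial weights run from 1 to 5.

```python
from math import sqrt
from statistics import NormalDist

def pearson(x, y):
    """Product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

counts = [10, 20, 40, 20, 10]  # hypothetical ratings per category
total = sum(counts)
serial, normalized, below = [], [], 0
for step, count in enumerate(counts, start=1):
    z = NormalDist().inv_cdf((below + 0.5 * count) / total)
    serial += [step] * count       # arbitrary serial weight for each ratee
    normalized += [z] * count      # normalized weight for the same ratee
    below += count

r = pearson(serial, normalized)
print(round(r, 3))
```

When the categories are spread roughly symmetrically, the normalized weights are close to a linear function of the serial weights, so the two scorings order and space the ratees almost identically.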

Split-half reliabilities were computed for 
the statistical and arbitrary weights, and were 
found to be as follows for the three scales: 
.791 and .785 for the Preliminary rating form, 
.903 and .920 for the Probationary rating 
form, and .937 and .920 for the Permanent 
Employee form. Again, statistically deter- 
mined weights were found to be no better 
than arbitrarily assigned weights. 
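
A minimal sketch of a split-half computation for a six-rating form. The text does not say how the halves were formed or whether the Spearman-Brown correction was applied, so the odd-even split and the stepped-up value below are assumptions, and the ratings are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def split_half_reliability(rows, first_half, second_half):
    """Correlate the two half-scores, then step up with Spearman-Brown."""
    a = [sum(row[i] for i in first_half) for row in rows]
    b = [sum(row[i] for i in second_half) for row in rows]
    r_half = pearson(a, b)
    return r_half, 2 * r_half / (1 + r_half)

# Hypothetical weighted ratings: one row per employee, six over-all ratings.
rows = [
    [1, 2, 1, 2, 1, 1],
    [2, 2, 3, 2, 3, 2],
    [3, 3, 3, 4, 3, 3],
    [4, 3, 4, 4, 5, 4],
    [5, 5, 4, 5, 5, 5],
]
r_half, r_full = split_half_reliability(rows, [0, 2, 4], [1, 3, 5])
print(round(r_half, 3), round(r_full, 3))
```

Running the same function once on statistically weighted ratings and once on serially weighted ratings gives the kind of paired reliabilities compared above.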

Table 1

Over-all Ratings from Permanent Employee Report,
Together with Normalized Standard Score
Weights (Col. A) and Arbitrary
Weights (Col. B)

Rating Categories

How well does this employee meet the requirements of the job?
    Does not meet job requirements
    Partially meets job requirements
    Slightly below job requirements
    Meets job requirements
    Slightly above job requirements
    Exceeds job requirements
    Far exceeds job requirements

Would you hire this employee over again if you could make the decision?
    Definitely yes
    Probably yes
    Probably no
    Definitely no

How satisfied are you with this employee?
    Very disappointed with employee
    Quite disappointed with employee
    Somewhat disappointed with employee
    Generally satisfied with employee
    Well satisfied with employee
    Exceedingly well satisfied with employee

Where would you rank this employee in a large group of persons holding this same job?
    Highest 10%
    Next highest 20%
    Middle 40%
    Next lowest 20%
    Lowest 10%

Which of the following terms best describes the over-all job performance of this employee?
    Unacceptable
    Very poor
    Poor
    Average minus
    Average
    Average plus
    Good
    Excellent
    Very superior

Which of the following do you consider the employee to be?
    Definitely outstanding
    Satisfactory
    Definitely a problem


The 245 persons on whom Preliminary ratings
were available were the same individuals
on whom Probationary ratings were available.
These were correlated to determine if
statistical weights resulted in a correlation 
significantly different from that obtained 
from arbitrary weights. The correlation be- 
tween the two rating forms was .397 for 
statistical weights, and .399 for arbitrary 
weights. Again, no superiority can be as- 
cribed to statistically determined weights. 
(Incidentally, though irrelevant to the pur- 
poses of this paper, these low correlations do 
not necessarily indicate low repeat reliability 
of the forms used. Other data indicate that 
the basic reason for low correlation is the 
difficulty if not impossibility of making ac- 
curate judgments on these employees after 
only thirty days’ employment.) 

Fifty-five employees had been rated by two 
different raters. This interrater reliability 
was found to be .750 using statistically deter- 
mined weights, and .751 using arbitrary 
weights. Once again, statistically determined 
weights failed to be superior to arbitrarily 
assigned weights. 


Summary and Conclusion 


Data from three rating forms were ana- 
lyzed to determine the extent of superiority 
of statistically determined weights to arbi- 
trarily assigned weights. Correlations be- 
tween statistically determined and arbitrarily 
assigned weights were so high that they can 
be considered to be one and the same. Split- 
half reliability showed no difference between 
statistical and arbitrary weights, correlations 
between two rating forms were the same for 
both, and interrater reliability showed no difference
between weights. Inasmuch as no
data in this study indicate any superiority
whatsoever of statistically determined weights 
as compared to arbitrarily assigned weights, 
the conclusion is reached that simplified, ar- 
bitrarily assigned weights can be as useful 
and accurate as the more elaborately deter- 
mined weights based on approved statistical 
methodology. 

The author has neither intent nor desire to 
oppose or belittle statistical methodology with 
respect to item weighting; he is thoroughly 
convinced of the theoretical superiority of 
statistical methods for determining weights. 
The fact that such superiority was not evident in
this study does not indicate that statistical 
methodology should be discarded or mini- 
mized in other situations. The difficulties 
and inaccuracies of employee rating are well- 
known. It may be that these are so great as 
to put ratings beyond help of accurate and 
statistically determined weights. Perhaps we 
are trying to measure the width of a river by 
holding a millimeter rule while swimming 
across the river. Be this as it may, the fact 


remains that statistically determined weights 
did not add one iota to the arbitrarily assigned
weights in the use of these rating
scales in an industrial situation. The burden 
of proof would appear to fall upon those who 
advocate statistically determined weights in 
situations of this type. 


Received October 18, 1954.


Use of a Weighted Application Blank in Hiring Seasonal
Employees 


Marvin D. Dunnette 
Industrial Relations Center, University of Minnesota 


and James Maetzold 
The Green Giant Company 


Numerous studies have shown the applica- 
tion blank to be a valuable predictive device 
in the selection of employees. Personal fac- 
tors such as age, marital status, participation 
in social and professional organizations, etc. 
often are closely correlated with length of 
service on a job and with the degree of effec- 
tiveness realized in the performance of the 
job. 

Most published research on application 
blanks, however, has been limited in two im- 
portant ways: (a) weights for the application 
blank have been developed separately for dif- 
ferent types of jobs, thus restricting use of 
the technique to positions with large num- 
bers of employees; and (b) major emphasis 
has been given to the selection of salesmen 
and clerical employees; application of the 
technique to production workers has been 
uncommon. 

The special problem described in this ar- 
ticle demanded a selection tool that could be 
applied to employees doing different kinds 
of production jobs. Therefore, a study was 
conducted to discover whether or not the 
weighted application technique might be ex- 
tended to this situation. 


The Problem 


The Green Giant Company operates eight 
plants in Minnesota devoted primarily to the 
seasonal canning of peas and corn. During 
June, July, August, and September, the firm’s 
manpower needs undergo tremendous and 
rapid expansion. Sources of labor to meet 
this demand consist, for the most part, of 
local housewives and other local part-time 
employees, high school and college students, 
transients, and migratory workers. At each 
plant a core of seasonal employees (house- 
wives, etc.) can be counted on to return year 
after year and work steadily throughout the 
summer. These may be called “seasonal 
regulars.” 


Many positions, however, must be filled 
anew each summer. Persons hired for these 
jobs often present serious turnover problems. 
It is not uncommon for an applicant to state 
on his application blank that he will work all 
summer only to leave with no apparent reason 
after working only a short time. Such rapid 
turnover proves, of course, to be expensive 
and inconvenient. This study deals with the 
possibility of improving the condition by de- 
veloping scoring weights for items of the ex- 
isting application blank. The weights are se- 
lected in such a way as to eliminate the maxi- 
mum number of undesirable candidates and a 
minimum of potentially stable workers. 


Method 


Scoring weights were developed first for employees 
working at the home plant in Le Sueur, Minnesota. 
The plan was to repeat the procedure at other plants 
if the Le Sueur results appeared promising. Jobs 
which called for special application blanks (e.g., 
trucker) were excluded from the study, as were the 
data for all women employees since they had been 
shown to be quite stable with respect to turnover. 
Application blanks for seasonal employees who 
worked during the summer of 1951 were used to de- 
termine scoring weights. Blanks for the 1952 sum- 
mer employees were used to cross validate the re- 
sults obtained, and the technique was applied to all 
applications received by four Green Giant plants 
during 1953. Steps taken in this series of studies 
are described below. 

A precise measure of employee turnover was not 
available. Many persons, at the time of hiring, said 
they would stay only a short time (a few days or 
weeks) and did so. Other employees left employ- 
ment for good reasons (such as returning to school 
or being drafted). Such workers do not constitute 
the undesirable turnover risks for which identification
was desired. Fortunately, the employment manager
of the Le Sueur operation could remember circumstances
surrounding the hiring and leaving of
nearly all seasonal employees. He was asked to 
separate the application blanks for 1951 in the fol- 
lowing way: 


1. First, he removed the core group, the “seasonal
regulars” described above. The applications of
these persons were not analyzed because the “seasonal
regulars” were well known and could be depended
upon year after year. No scoring weights
or application-blank analysis was necessary for this 
task. 

2. Remaining employees were judged as good, poor, 
or doubtful with respect to turnover. Applications 
placed in the doubtful category included those whom 
the employment manager did not remember, those 
who had valid reasons for leaving, and those who 
were undesirable employees for reasons other than 
turnover. 

3. In order to determine the consistency with 
which the employment manager could identify good 
and poor turnover employees, 105 application blanks 
were chosen randomly from the total. After these 
had been sorted into their respective categories, they 
were “shuffled” and re-sorted two weeks later. There- 
fore, it is unlikely that memory was an important 
factor in determining how the blanks were sorted 
the second time. Of 40 persons classified good in 
the first categorization, 36 (90%) were so classified 
in the second categorization. Of 33 persons classi- 
fied poor in the first categorization, 27 (82%) were 
so classified in the second categorization. It was 
concluded that good and poor turnover risks could 
be judged with sufficient reliability to warrant their 
use as criterion groups. 

Application blanks assigned to the extreme cate- 
gories (good and poor) were used in the derivation 
of scoring weights. These groups numbered 269 and 
250, respectively. 
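
The consistency percentages reported in step 3 follow directly from the counts given there. A short sketch of the arithmetic (the function name is hypothetical):

```python
def percent_agreement(same_both_times, first_sort_total):
    """Percentage of blanks placed in the same category on both sortings."""
    return 100.0 * same_both_times / first_sort_total

good = percent_agreement(36, 40)  # classified good both times
poor = percent_agreement(27, 33)  # classified poor both times
print(round(good), round(poor))  # 90 82
```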


Results 


The percentage of good employees falling 
in each application-blank item category was 
compared with the percentage of poor em- 
ployees in the corresponding category. The 
size of the difference between percentages dic- 
tated the weight or score to be applied to the 
corresponding response on the blank. Net 
weights were assigned according to tables 
given by Stead and Shartle.¹ Positive scores
are favorable to the applicant; negative scores 
are unfavorable. Of 24 items analyzed, 12 
showed differences large enough to be 
weighted. 
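
The weighting step can be sketched as follows. The Stead and Shartle tables are not reproduced in this article, so the conversion rule used here (one weight point per ten percentage points of difference) is a stand-in chosen only for illustration, and the response tallies are hypothetical, although their totals match the 269 good and 250 poor blanks.

```python
def item_weights(good_counts, poor_counts, points_per_step=10):
    """Weight each response by the good-minus-poor percentage difference.

    The divisor is a placeholder rule, not the published net-weight table.
    """
    n_good = sum(good_counts.values())
    n_poor = sum(poor_counts.values())
    weights = {}
    for response in good_counts:
        pct_good = 100.0 * good_counts[response] / n_good
        pct_poor = 100.0 * poor_counts.get(response, 0) / n_poor
        weights[response] = round((pct_good - pct_poor) / points_per_step)
    return weights

# Hypothetical tallies for one item, "has a telephone".
good = {"yes": 188, "no": 81}
poor = {"yes": 100, "no": 150}
w = item_weights(good, poor)
print(w)  # {'yes': 3, 'no': -3}
```

An applicant's total score is then the sum of the weights attached to the responses on his blank, positive totals being favorable.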

In terms of scoring weights, the typically 
stable Green Giant production worker lives 
in Le Sueur, has a telephone, is married and 
has no children, is not a veteran, is either 
young (under 25) or old (over 55), weighs 
more than 150 pounds but less than 175, has 
obtained more than ten years education, has 
worked for Green Giant before, will be available
for work until the end of summer, and
prefers field work to inside work. Use of
these scoring weights on 1951 employees
would naturally show a difference in total score
between good and poor employees. Cross
validation was necessary in order to insure
the stability of the scoring weights. Therefore,
application blanks for employees who
worked during the summer of 1952 were
scored with a template based on the significant
factors mentioned above.

1 W. H. Stead & C. L. Shartle, Occupational counseling
techniques. New York: American Book Co., 1940.


Table 1

Means, Quartiles, and Standard Deviations of Weighted
Application-Blank Scores for 201 Poor Employees
and 244 Good Employees Who Worked
During the Summer of 1952

Employees         Q1            Q2    Q3    Mean    SD
Good (N = 244)    [illegible]   10    18     9.2    11.1
Poor (N = 201)    -12           -2     6    -2.4    12.9


Table 1 shows the means, quartiles, and 
standard deviations of scores obtained by 201 
poor employees and 244 good employees. 
The difference between means is highly sig- 
nificant, statistically. That the difference is 
also of practical significance is shown by the 
low overlap between the two distributions. 
Only 16% of poor employees reach or exceed 
the median of good employees; only 18% of 
good employees are below the median of poor 
employees. 
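
The two overlap figures can be computed from raw scores as sketched below. The individual scores are not given in the article, so the two short lists are hypothetical and will not reproduce the 16% and 18% reported above.

```python
from statistics import median

def percent_at_or_above(scores, cutoff):
    """Percentage of scores that reach or exceed the cutoff."""
    return 100.0 * sum(s >= cutoff for s in scores) / len(scores)

def percent_below(scores, cutoff):
    """Percentage of scores that fall below the cutoff."""
    return 100.0 * sum(s < cutoff for s in scores) / len(scores)

# Hypothetical weighted application-blank scores.
good_scores = [-4, 2, 6, 8, 10, 10, 12, 16, 20, 24]
poor_scores = [-14, -10, -6, -2, -2, 0, 4, 8, 12, 16]

poor_over_good_median = percent_at_or_above(poor_scores, median(good_scores))
good_under_poor_median = percent_below(good_scores, median(poor_scores))
print(poor_over_good_median, good_under_poor_median)  # 20.0 10.0
```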

Since these scoring weights showed good 
discrimination between good and poor em- 
ployees in this independent sample, it was 
decided to use the application blank for hir- 
ing applicants during the summer of 1953. 
In the canning industry, labor needs differ 
considerably from time to time, and the sup- 
ply of available manpower fluctuates sharply. 
Because of this factor, it was impossible to 
establish a single critical cutting score which 
could be adhered to during the whole summer.²
During any hiring period, therefore,
the size of the selection ratio governed the
size of the critical score. Sometimes it was
necessary to hire all applicants regardless of
their score. At other times, it was possible
to set a relatively high critical score and hire
only the cream of the crop, so to speak. This
use of the weighted application blank had the
effect, probably, of reducing the amount of
discrimination between good and poor employees;
however, judgments by the employment
manager at the end of the 1953 season
still showed a good degree of separation.
Table 2 gives the relevant statistics for good
and poor employees who worked during the
summer of 1953 at the Le Sueur plant.
Clearly, the weighted application blank is
stable over a period of time. The scoring
weights, derived from 1951 employees, still
are effective in discriminating between good
and poor turnover risks among 1953 employees.
To what extent do these same scoring
weights identify good and poor turnover
risks hired in other Minnesota plants of the
Green Giant Company? This question is
considered and answered below.

2 Actually, a critical score of 0 would have been
optimum. Hiring decisions based on a critical score
of 0 would have resulted in hiring 78% of potentially
good, or stable, employees while at the same time
excluding over half of potentially poor, or unstable,
employees.
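
The effect of a given critical score can be tabulated as in the sketch below. Treating a score equal to the critical score as a hire is an assumption, and the score lists are hypothetical, so the percentages differ from those quoted for a critical score of 0.

```python
def screening_outcome(good_scores, poor_scores, critical_score):
    """Percent of good employees hired and of poor employees excluded."""
    hired_good = sum(s >= critical_score for s in good_scores)
    excluded_poor = sum(s < critical_score for s in poor_scores)
    return (100.0 * hired_good / len(good_scores),
            100.0 * excluded_poor / len(poor_scores))

# Hypothetical weighted application-blank scores.
good_scores = [-4, 2, 6, 8, 10, 10, 12, 16, 20, 24]
poor_scores = [-14, -10, -6, -2, -2, 0, 4, 8, 12, 16]

pct_good_hired, pct_poor_excluded = screening_outcome(good_scores, poor_scores, 0)
print(pct_good_hired, pct_poor_excluded)  # 90.0 50.0
```

Repeating the call over a range of critical scores shows the trade-off that the selection ratio imposed in practice: a higher score excludes more poor risks but also turns away more stable workers.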


Validity Generalization 


A crucial test of the validity of scoring 
weights developed on Le Sueur employees is 
whether or not these same weights may be 
used, with valid results, to select applicants 


at other Green Giant plants. In order to 
make this test, application blanks for 1953 
seasonal employees in three other Green Giant 
firms were scored. The template used in the 
scoring was virtually the same as the one developed
in the pilot studies at the home office
(Le Sueur). Four Green Giant firms in the 
Minnesota Division were excluded from the 
study because turnover is not a serious problem
at these plants; their seasonal labor force
consists almost exclusively of “seasonal regulars”
who return each year.


Table 2

Means, Quartiles, and Standard Deviations of Weighted
Application-Blank Scores for 129 Poor Employees
and 121 Good Employees Who Worked
During the Summer of 1953

Employees    Q1  Q2  Q3  Mean
Good, N = 121:  14  21
Poor, N = 129:  6  15  6.4
13.4


Table 3

Medians and Quartiles of Weighted Application-Blank
Scores for Good and Poor Employees at Montgomery,
Blue Earth, and Glencoe

Plant         Employees    N     Q1
Montgomery    Good         283    6
              Poor          54    2
Blue Earth    Good         210    6
              Poor          62   -4
Glencoe       Good         227    8
              Poor          73   -2


Table 3 shows that this weighted applica- 
tion blank effectively predicts turnover at the 
other three plants. It should be noted, how- 
ever, that the accuracy of prediction differs 
from one to another. This suggests that re- 
vision of the scoring weights based on the 
data from each plant might significantly in- 
crease the accuracy of separation. However, 
the twofold cross validation of the Le Sueur 
results and the generalization of the scoring 
weights to other firms show that the factors
originally identified (marital status, age, dis- 
tance from work, etc.) are consistently asso- 
ciated with turnover for these seasonal em- 
ployees. 

The major conclusion to be drawn from 
these studies is that the weighted application 
technique has proved successful in a way that 
has not been widely recognized. That these 
employees were doing different types of blue- 
collar jobs suggests an important extension 
of the method. These results show that it 
may be worth while to investigate the use of 
weighted application blanks for employees 
engaged in a wide variety of industrial jobs 
instead of limiting use of the technique to 
persons employed in sales or clerical posi- 
tions. The potential usefulness of the method, 
therefore, is broadened considerably. Since 
development of scoring weights is relatively 
simple both from a technical and from a time 
standpoint, it would merit initial study when 
problems of selection or placement are faced 
and need to be solved. 


Received September 21, 1954.







Time in Training as a Criterion of Success in Radio Code 


Leonard V. Gordon 
U.S. Naval Personnel Research Field Activity, San Diego¹


Within many training situations there is 
variation in the time taken by different indi- 
viduals to attain a required level of proficiency 
in particular skills. This occurs both when 
the trainee is free to advance at his own rate 
of learning, and where policies exist of ad- 
vancement or setback from one class or group 
to another. In such circumstances these in- 
dividual differences in time in training may 
be useful for criterion purposes. This meas- 
ure is a logical extension of the “trials to cri- 
terion” method used in laboratory learning 
studies. 

Time in training is a very practical cri- 
terion, in and of itself. It is more expensive 
to train the slower learner, representing an 
additional cost to the organization. Further- 
more, where individuals with the particular 
skills are in demand, the slower learner is, in 
effect, blocking the training of someone else 
during the extra time it takes him to learn 
his job. 

In addition, the time it takes a person to 
learn a particular skill may be prognostic of 
eventual success on the job, where growth is 
possible, since initial learning is a good pre- 
dictor of subsequent learning. Furthermore, 
there is reason to believe, from the results of 
educational research, that faster learners are 
the better performers. This would lend ad- 
ditional support to the meaningfulness of 
time in training as a criterion of success. 

Finally, time in training is likely to possess 
the characteristics demanded of an acceptable 
criterion. It is likely to be relevant, since 
what is learned usually bears on what is to be 
done; it is likely to be comparable among in- 
dividuals, in stable training programs; it is 
likely to be reliable, if careful and repeated 
evaluations are made to determine whether 
the student should be permitted to progress; 
and it is likely to be practical, since informa- 
tion of this type is readily accessible. 

1 The opinions expressed are those of the author
and are not to be construed as being official or in
any way representative of the U. S. Navy.

In a validation study of the Navy Radio
Code Aptitude Test, time in training was in- 
cluded with two other commonly used cri- 
terion measures—the average of a series of 
progress tests and a final proficiency exami- 
nation. The results of this study are pre- 
sented below to show the interrelationships 
among the three criteria. 


The Study 


The Navy Radio Code Aptitude Test 
(NRCAT) is a selection test for the basic 
course in Radioman School, the primary goal 
of which is instruction in Morse code. The 
test includes an initial training section during 
which three letters in code are learned. The 
test itself involves the copying of these letters 
at an increasing rate of speed. All recruits 
admitted to Radioman School must qualify 
on this test. Students coming from the fleet 
are not required to take the NRCAT. 

Unless dropped from the course, all stu- 
dents must remain in school for a minimum of 
16 weeks. At the end of each week (after 
the third week), they are given a radio code 
progress test to determine whether they are 
able to advance with their present class. If 
a student falls two or more words behind the 
standard speed set for his class for a particu- 
lar week, he is set back one class, represent- 
ing two weeks of additional training. Stu- 
dents may be set back as many as three 
times, depending upon the judgment of the 
school board as to their salvageability. Stu- 
dents who are able to progress faster than the 
rest of their class are permitted to practice 
code at advanced speeds and to take weekly 
tests at these speeds. This also applies to in- 
dividuals entering the course with some ability 
to receive Morse code. Thus, individual dif- 
ferences in time required to qualify at the 
graduation standard of 20 words per minute 
range from 3 to 22 weeks. 
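
The setback rule described above reduces to simple arithmetic. The following is a minimal sketch, assuming a student on the standard schedule who needs the full 16 weeks; the function names and constants are hypothetical labels for the figures quoted in the text, not part of any Navy procedure.

```python
# Illustrative sketch only: names and structure are assumptions.
STANDARD_WEEKS = 16      # minimum course length stated in the text
WEEKS_PER_SETBACK = 2    # being set back one class adds two weeks
MAX_SETBACKS = 3         # students may be set back as many as three times


def weeks_in_training(setbacks: int) -> int:
    """Weeks in school for a standard-schedule student with the given setbacks."""
    if not 0 <= setbacks <= MAX_SETBACKS:
        raise ValueError("setbacks must be between 0 and 3")
    return STANDARD_WEEKS + WEEKS_PER_SETBACK * setbacks


def is_set_back(student_wpm: float, class_standard_wpm: float) -> bool:
    """True when the student is two or more words per minute behind the class standard."""
    return class_standard_wpm - student_wpm >= 2
```

Under these assumptions three setbacks give 22 weeks, which agrees with the upper end of the reported range; the lower end reflects students who qualify early at advanced speeds.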

At the end of the week preceding gradua- 
tion, normally the fifteenth week, a final 
standardized radio code examination is given 
at 20 words per minute. This test was developed,
primarily, as a research instrument,
and alternate form reliabilities among the 
several forms range from .87 to .92. 

Intercorrelations among the NRCAT, av- 
erage of the weekly progress tests, the final 
proficiency test, and time in weeks to reach 
20 words per minute are presented in Table 1 
for three groups of students entering the 
Radioman course. 

Group A is a sample of 139 recruits, with 
those indicating civilian experience or fa- 
miliarity with Morse code eliminated from 
the analysis. Data for this group are pre- 
sented in an earlier report (1). Group B 
consists of 118 recruits, typical of those en- 
tering the course, with 6.1% indicating they 
could receive code at about 5 words per min- 
ute and 3.3% at 10 words per minute or 
greater. Group C had come from shipboard 
assignments, with 58.3% reporting some radio 
experience aboard ship. Of the entire group, 
23.5% stated they could receive code at about 
5 words per minute, and 19.6% at 10 words 
per minute or greater. 

First, it may be noted that the aptitude 
and criteria scores for these groups show the 
expected trend. The more experienced the 
group when entering the course, the higher 
the NRCAT and criterion scores. Since 
Bartlett’s test indicates significant differences 
among the variances for both the NRCAT 
and criterion measures, the conventional F 
test for significance of differences among the 
means may not properly be made. Neverthe- 
less, the obtained differences in time in train- 
ing between groups, ranging from one week 
to two and three-quarter weeks, are of prac- 
tical significance, and are of such magnitude 
as to suggest a probable statistical signifi- 
cance. 

Secondly, for all three groups, the NRCAT 
correlates higher with time in training than 
with either of the other proficiency measures. 
Thus, time to achieve 20 words per minute is 
more predictable by the best available apti- 
tude measure than either of the other criteria. 

Third, time in training is substantially re- 
lated to weekly grades, as expected, since fail- 
ing weekly grades result in the student being 
set back. 


Table 1

Intercorrelations Among the Navy Radio Code Aptitude Test and Three Criteria

Groups: Group A (139 recruits), Group B (118 recruits), Group C (143 fleet)
Variables correlated within each group: NRCAT, Weekly Grade, Final Code, Time in Training

* Significant at the .05 level; all other r’s are significant at or beyond the .01 level.
** The Weekly Grade and Final Code scores for Group A involve different units of measurement from Groups B and C.


Fourth, time in training is substantially related
to proficiency at the end of the course
at the standard speed required for graduation.
Thus, students who attain the required speed
earlier are superior performers at that speed
at the end of the course. A similar relationship
between time in training and flight proficiency
was reported by Rosenberg (2) for
a group of Naval Aviation cadets.

Finally, it might be indicated that none of 
the validities reported reflects the “true” va- 
lidity of the NRCAT, but rather the predic- 
tive efficiency of the aptitude test under three 
operating conditions. The correlations for all 
groups are attenuated by the elimination of 
drops from the analysis (13.9, 34.5, and 
18.7% for Groups A, B, and C, respectively). 
The correlations for Groups A and B are 
further attenuated by the operational use of 
the NRCAT as a selection test. Since the
NRCAT has a mean of 50 and a standard
deviation of 10 for an unselected recruit population,
the magnitude of the attenuation may
be judged. Furthermore, the NRCAT does
not function as a “pure” aptitude test for
Group B, and especially for Group C, if the
concept of “an equal opportunity to acquire
the skill or knowledge” is applied. It is obvious
that the training section of the NRCAT
does not present an equal opportunity for the
naive individual and the person with experience
in Morse code to learn the three letters
in the test.
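
The size of the attenuation due to selection can be gauged with the standard textbook correction for direct range restriction (often called Thorndike's Case II). This is a swapped-in formula, not a computation the author reports. In the sketch below the restricted standard deviation and the observed correlation are hypothetical inputs; only the unselected standard deviation of 10 comes from the text.

```python
import math


def correct_for_range_restriction(r_restricted, sd_unrestricted, sd_restricted):
    """Estimate the unselected-population correlation from a selected sample."""
    k = sd_unrestricted / sd_restricted
    return (r_restricted * k) / math.sqrt(1 - r_restricted**2 + (r_restricted * k) ** 2)


# Hypothetical example: an observed r of .40 in a selected group whose
# NRCAT standard deviation has shrunk to 7 (both values are assumptions).
estimate = correct_for_range_restriction(0.40, sd_unrestricted=10, sd_restricted=7)
```

When the two standard deviations are equal the function returns the observed correlation unchanged, which is a quick check that no correction is applied where no selection has occurred.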


Conclusions 


In its present application, time in training 
appears to be a highly meaningful criterion 
measure. It reflects directly the cost of train- 
ing an individual; it is related to his profi- 
ciency level at the end of training; it is pre- 
dictable by the best selection test available. 
In these respects it performs better than such 
conventionally used measures as average 
weekly progress tests or a final proficiency 
test. 

Time in training is most meaningful when 
the trainee is permitted to pursue the acqui- 
sition of particular skills to a required level 
at his own rate, and when his progress is 
periodically and carefully evaluated in these 
skills. This may occur within the framework 
of a formalized training program or in on- 
the-job training where instructional differ- 
ences are not significant. This measure would 
probably merit wider application than indi- 
cated by its infrequent appearance in the ap- 
plied literature. 


Received September 20, 1954. 

Although keysets are used on a great variety
of machine devices (computers, coding
devices, and communications equipment),
there appear to be few systematic studies concerned
with the design factors that make keysets
easy or hard to use. The study reported
here deals with one aspect of keyset design,
viz., the locations of numbers and letters on
individual keys. In addition, we are concerned
here with a particular class of keysets,
ten-button sets used by long-distance telephone
operators, but the results probably can
be generalized to other practical situations.

In making long-distance calls, telephone 
operators use a set of ten keys, arranged in 
two vertical rows of five, with letters and 
numbers on the keys as shown in Fig. 1. 

To complete a call, the operator usually 
keys a letter-number combination which looks 
like this: 

815 RE 4—0267 


The patterns of errors made by operators suggest
that a different arrangement of the letters
and numbers on the keys, or of the keys
themselves, might help to reduce errors. As a
first step in the determination of the best arrangement
of the keys and of the letters and
numbers on them, we decided to find out
where people say they would expect to find
letters and numbers on six different keyset
configurations, only one of which resembles
the present set (see Fig. 3).

Fig. 1. Arrangement of letters and numbers on a toll operator’s keyset.

1 This study was done at the Bell Telephone Laboratories.

This is not an unusual approach in psy- 
chology. There are studies (1, 2) which show 
that learning is more rapid and errors are 
fewer for tasks in which the stimuli and re- 
quired responses are in an “expected” relation 
than in those where they are not. If people 
have definite expectancies about the locations 
of numbers and letters on keysets, this would 
provide some rationale for the selection of 
particular keysets to be used in further op- 
erational tests. 

The specific problem investigated had three 
parts: 


1. Where do people expect to find numbers 
on each of six configurations of ten keys? 

2. Where do people expect to find letters 
on each of six configurations of ten keys? 

3. Where do people expect to find letters 
on each of six configurations of ten keys, given 
certain preferred number arrangements al- 
ready on the keys? 


Method 


Subjects. The subjects for this experiment were 
classified according to (a) age, (b) sex, (c) previ- 
ous experience on keysets such as appear on com- 
puting machines, typewriters, and musical instru- 
ments. Three hundred Ss were used, one hundred 
to answer each of the three questions, each one hun- 
dred chosen as in Table 1. 

Test Materials. The test materials consisted of 
booklets containing circles arranged in each of the 
six configurations shown in the top row of Fig. 3. 
Each configuration appeared on a separate page. In 
Part I, a random arrangement of the digits 0 to 9 
was printed on the page opposite each configuration 
of circles. In Parts II and III a random arrange- 
ment of the alphabet (except the letters Q and Z) 
was printed on the page opposite each configuration. 
For Part III only the booklets used configurations 
with numbers already printed in the circles (see 
Fig. 2). The numbering arrangements selected were
some of the most commonly chosen ones, as determined
in Part I.

Table 1

Each of the Three Groups of 100 Subjects Was Composed of Sampling Subgroups Containing the Frequencies Shown Here

Column groups: Men (ages 20-30, 30-40, Above) and Women (ages 20-30, 30-40, Above)
Naive        8
Experienced  8

For all parts, each configuration of keys appeared 
first in a set of one hundred booklets either six or 
seven times. Also for each group of one hundred 
booklets, there were equal, or as nearly equal as pos- 
sible, numbers of the sets of letters (or numbers) 
beginning with each letter (or number). Each num- 
ber or letter beginning a set appeared opposite each 
configuration of keys an equal number of times. 

Procedure. The experimenter read to each S in- 
structions for filling out the booklets. In Part I the 
instructions told the subject to: (a) Put only one 
number on each circle. (b) Place the first number 
appearing in the list (opposite the circles) on a 
circle first, the second number appearing in the list 
on a circle next, etc. (c) Place the numbers on the 
circles in the order in which he would like to use 
them if he had to key numbers all day long. The 
instructions particularly stressed that the S should
forget about any keyset arrangements which he may 
have seen in the past, and that the numbers (or let- 
ters) should be placed on the keys where he would 
expect to find them, in the most natural arrange- 
ment. (d) Fill the first configuration of circles first, 
then go to the next one. 

In Part II the instructions were the same as those 
for Part I except: (a) The instructions did not 
specify the number of letters that the subject should 
place on one circle. (b) All instructions were given
in terms of letters, rather than numbers. In Part 
III the instructions were the same as those in Part II 
except that the subjects were told that numbers al- 
ready had been placed on the keyset. 

Subjects were given unlimited time, and were al- 
lowed to erase previous selections if they changed 
their minds part way through a page. 


Results 


Part I. The most frequently chosen number
arrangements are shown in Fig. 3. The
outstanding feature of the first choices is that
the numbers are placed in order in horizontal
rows, beginning with the top row. In the arrangements
where there is an “extra” circle,
the zero is placed in that circle. In the arrangements
where there is no extra circle, the
0 always follows the 9; it never precedes
the 1.

Fig. 2. Configurations and number arrangements tested in Part III.

The second most frequent kind of choice
generally has numbers in vertical rows, increasing
from top to bottom. Numbering
plans which arrange the numbers in horizontal
rows, starting with the bottom row,
are next most common.
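
The most frequent choice, reading order with the zero last, can be written down directly. The sketch below is illustrative only: the function names are hypothetical, and it covers just the full rectangular grids (two rows of five, five rows of two) and a 3 × 3 block with one extra key, not the exact geometry of every configuration in Fig. 3.

```python
def expected_number_layout(rows: int, cols: int) -> list[list[int]]:
    """Digits in reading order over a rows x cols grid of ten keys; 0 follows 9."""
    if rows * cols != 10:
        raise ValueError("this sketch covers only full rectangular grids of ten keys")
    digits = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
    return [digits[r * cols:(r + 1) * cols] for r in range(rows)]


def expected_layout_with_extra_key() -> tuple[list[list[int]], int]:
    """A 3 x 3 block filled in reading order, with 0 assigned to the extra key."""
    block = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    return block, 0
```

For example, `expected_number_layout(2, 5)` places 1 to 5 in the top row and 6, 7, 8, 9, 0 in the bottom row, so the 0 follows the 9 rather than preceding the 1.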

Chi-square tests show that: (a) There were 
no statistically significant differences between 
the age groups in numbering arrangements 
for any of the six configurations. (b) There 
were no statistically significant differences be- 
tween naive and experienced subjects in num- 
ber arrangements for any of the six configura- 
tions. (c) There were statistically significant 
differences (at .01 > p > .001) between men 
and women in numbering arrangements for 
each of the four configurations containing 
three rows of circles with one extra circle 
(the men chose the preferred arrangements 
more often than did the women); there were 
no significant differences between the sexes on 
the other two configurations of keys. 
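
A comparison of this kind can be sketched as a Pearson chi-square test on a 2 × 2 table. The counts below are hypothetical, invented purely to show the computation, since the frequency tables themselves are not printed here; the function name is likewise an assumption. The sketch uses one degree of freedom and no continuity correction.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value for a 2 x 2 table (df = 1)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # For one degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts: rows are men and women; columns are
# "chose the preferred arrangement" and "chose another arrangement".
hypothetical_counts = [[40, 10], [26, 24]]
statistic, p_value = chi_square_2x2(hypothetical_counts)
```

With these invented counts the p-value falls between .001 and .01, the same band reported for the sex difference, but the counts carry no evidential weight of their own.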

Part II. The most frequently chosen let- 
tering system agreed with the most common 
numbering system (see Fig. 3), i.e., the let- 
ters were placed in order in horizontal rows 
beginning with the top row. No matter what 
the configuration of keys, this lettering sys- 
tem was chosen by approximately one-third 
of the one hundred subjects. There were usually
two or three letters in order on each circle.
In the configurations where the “extra”
circle is at the top, the first letters were placed
in that circle; if the “extra” circle is at the
bottom, the last letters were placed in that
circle. If the “extra” circle is to one side of
the others, letters were either placed in it filling
it out as if it were an ordinary part of the
horizontal row in which it lies, or the last
letters were placed in it.

Fig. 3. First three choices by frequency for number arrangements on each of the six configurations tested in Part I.

There were no statistically significant differences
in lettering arrangements on any of
the six configurations between (a) men and
women, (b) naive and experienced subjects,
(c) subjects in the three age groups.

Part III. Three of the configurations used
in Part III (a, b, and c in Fig. 2) contain the
number arrangements chosen most often by
the Part I group for each of the three major
classes of keyset design (that is, two vertical
rows of five keys, two horizontal rows of five
keys, and a 3 × 3 configuration with an odd
key). For these three configurations, the
most frequently chosen lettering system was
the same as found in Part II, i.e., the letters
were placed in order in horizontal rows, beginning
with the top row. This lettering system
was chosen for each arrangement by approximately
one-half of the total group.

There were no significant differences between
the choices of (a) men and women,
(b) naive and experienced subjects, (c) the
different age groups.

The other three configurations used in Part 
III (d, e, and f in Fig. 2) were selected be- 
cause they represent some numbering arrange- 
ments which were chosen with fairly high fre- 
quency in Part I but which are markedly 
different from the most commonly selected 
ones. For these three configurations, the 
group was almost equally divided between a 
lettering system which followed the number 
arrangement on the keys (putting two or 
three letters in order on each key) and the 
lettering system which appeared most frequently
on the other configurations in Part
III (in order in horizontal rows, beginning
with the top row). Each of these alternative
systems was chosen for each configuration by
approximately one-third of the group.

According to chi-square tests, there were no
significant differences between lettering arrangements
chosen by (a) men and women,
(b) naive and experienced subjects, (c) different
age groups of subjects.

An incidental finding was that no subject 
arranged the numbers or the letters as they 
are now arranged on the keyset. 


Discussion 


The most obvious finding of this study is 
that people arrange numbers and letters in 
the order in which they normally read. This 
is true for all configurations of keys, for both 
numbers and letters, and for both experienced 
and inexperienced subjects. Of the several 
calculating devices we have been able to look 
at, only one (the IBM card punch) uses an 
arrangement which is highly preferred in this 
study. It resembles the pattern illustrated as 
the fourth from the left in the top row of 
Fig. 3. Two other calculators have keysets 
resembling the third from the left in the 
second row of Fig. 3. These are the multi- 
plier keys on the Friden calculator and the 
keyset of the Remington Rand adding ma- 
chine. Most other calculators have their keys 
reading upward in vertical rows of ten. In 
all of these, the “0” customarily is placed below
the “1,” a placement which was seldom
found in our tests.

Of course, we have no assurance that these 
differences between calculator keysets are of 
any practical importance. In fact, our study 
of expectancies can be considered only a first 
step toward finding the keyset which would 
give the fewest errors and shortest keying 
times. The next logical step would be to 
test some most-frequently chosen letter and 
number arrangements against some of the 
least-frequently chosen ones in an actual key- 
ing situation. 

A final word is in order concerning the
greater consistency the men showed in choosing
number arrangements. We believe that
this may reflect a difference in experience between
the two groups which was not taken
into account in selecting our samples. Virtually
all of the men had technical training and
were either engineers or their assistants, while
most of the women were typists, clerical help,
or maintenance people unexposed to any sort
of technical training. Thus even our “inexperienced”
men undoubtedly had greater familiarity
with number systems and the arrangement
of numbers on graphs and other
displays than the corresponding group of
women.


Summary and Conclusions 


This experiment attempted to find out 
where people expect to find letters and num- 
bers on each of six configurations of ten keys 
each. Three hundred Ss, stratified according 
to age, sex, and previous experience on key- 
sets, were asked to write on diagrams of key- 
sets either numbers or letters in the arrange- 
ments that they felt were most natural. Our 
results show: 


1. People expect to find numbers on key- 
sets arranged in left-to-right order in hori- 
zontal rows starting with the top row, for all 
of the six configurations of keys. 

2. People expect to find letters on the key- 
set arranged in left-to-right order, with two 
or three letters in order on each key, in hori- 
zontal rows, starting with the top row, for all 
of the six configurations of keys tested. 

3. With numbers already on the keyset:
(a) People expect to find the letters arranged
in horizontal rows, beginning with the top
row, for those patterns in which the numbers
are arranged that way. (b) When the numbers
are arranged in patterns not having numbers
in order in horizontal rows (beginning
with the top row), people are equally divided
between lettering arrangements following the
numbering pattern and lettering as in (a)
above, even though this conflicts with the
numbering arrangement.


Received September 23, 1954. 

Various investigators have used tests for
prognosticating driving aptitude and ability. 
Early contributions by Snow (8), Slocombe 
and Bingham (7), Weiss and Lauer (10), 
Allgaier and Lauer (1), and others have ex- 
plored the field and suggested possibilities for 
improving the predictive value of such tests. 
Later studies by De Silva (2), Kraft and 
Forbes (4), Wilson (11), Ghiselli and Brown 
(3) have furthered the progress of this spe- 
cial phase of industrial psychology. 

Until recently no one had established a 
satisfactory criterion of driving ability. Ac- 
cidents are unreliable and no other criterion 
was found to do the job. Allgaier and Lauer 
(1) found ratings a more satisfactory cri- 
terion of performance than accident records. 

Uhlaner, Van Steenberg, and Goldstein (8) 
found a reliable criterion of driving aptitude 
could be established by ratings of driver re- 
actions to specific situations and a check list 
of driving habits. Their instrument utilized 
the ratings of associates and supervisors. 

With motorization of the Armed Forces a 
better classification and selection system was 
needed to facilitate the important work of 
transportation. A battery of individual driv- 
ing tests adopted at the beginning of World 
War II has been widely used, but these tests 
have certain limitations, viz.: (a) they are 
impractical to administer as group tests, (b)
the equipment is more expensive, and (c) the 
examiners need considerable special training 
not ordinarily given even in college courses. 

In a contract research with TAGO, Department
of Army, a study was carried out to
determine whether tests covering postulated
functions of driving aptitude could be assembled
that would improve the classification
efficiency for Army driving personnel.

1 In part this report is based on contract research
carried out under the auspices of Department of
Army, Personnel Research Branch, TAGO. Discussion
of the portions of this study based on the contract
research presented in this paper represents the
opinion of the author and does not necessarily reflect
the view of the Department of the Army.

The hypothesis set up for investigation may 
be stated as follows: Driving aptitude can be 
measured by paper-and-pencil tests and these 
are more predictive of the criterion of driving 
performance than psychophysical tests. 


Design and Method of Procedure 


The design of the problem was essentially a three- 
phase validation study. 

The first stage was a preliminary run in which 
various test materials were developed or assembled 
to measure the following basic postulated functions 
of driving ability: 

1. Physical activity as indicated by measures of 
motility. 

2. Gross eye-hand coordination, or macroscopic 
dexterity. 

3. Fine eye-hand coordination (microscopic dexterity
or control of involuntary movements).

4. Speed and accuracy in visual perception. 

5. Skill in judging spatial relations. 

6. Visual memory. 

7. Judgmental factors related to automotive ma- 
nipulation. 

8. Personality factors as they relate to driving. 

9. Knowledge of the principles of driving, driving 
regulations, and vehicle potentialities. 

Reliabilities were established by suitable techniques 
to fit the respective instruments used. Evaluation of 
predictors was made by correlating the raw test 
scores against the standard scores of the composited 
criterion and by item analysis of the instruments to 
which this method was applicable. The criterion
score was computed by the following formula: 


Criterion = R + H,

where

$$R = \left(\frac{X_r - \bar{X}_r}{\sigma_{X_r}}\right) 20 + 100,$$

and

$$H = \left(\frac{X_h - \bar{X}_h}{\sigma_{X_h}}\right) 10 + 50.$$

Legend:
$R$ = standard score on sum of four scale portion ratings of the criterion.
$H$ = standard score on habit section of the form.
$X_r$ = raw score on scale portion of criterion.
$\bar{X}_r$ = average of raw scores on the scale portion of ratings.
$\sigma_{X_r}$ = standard deviation of the raw scores on the four scales used.


Measurement of Driving Aptitude 


Table 1

Best Predictors of Driving from Paper-and-Pencil Tests and Age

Predictor               Best Rating   Functions Assumed Sampled            Validity
Age                     (older)                                            .12
Attention to detail     (high)        (Attention and perception)           [illegible]
Driver’s SD blank       (high)        (Personality and background)         [illegible]
Driving know-how        (high)        (Driving information)                [illegible]**
Emergency judgment      (high)        (Judgment in driving situation)      [illegible]
Two-hand coordination   (high)        (Motor coordination)                 .20**
Word matching           (high)        (Visual acuity)                      [illegible]

* Significant at the 5% level of confidence.
** Significant at the 1% level of confidence.


The legend given for the R values applies equally to the
formula for H, the driving habits score. Note that instead
of using a mean of 100 and a standard deviation of
20, a mean of 50 and standard deviation of 10 were
used. This was tantamount to weighting the ratings
double that of the habits as checked.
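
The composite can be sketched as follows. The raw scores are hypothetical, the function names are assumptions, and the population standard deviation is used because the report does not say which form was applied.

```python
from statistics import mean, pstdev


def standard_scores(raw, new_mean, new_sd):
    """Rescale raw scores to the requested mean and standard deviation."""
    m, s = mean(raw), pstdev(raw)
    return [(x - m) / s * new_sd + new_mean for x in raw]


def criterion_scores(rating_raw, habit_raw):
    """Criterion = R + H, with R on a 100/20 scale and H on a 50/10 scale."""
    r = standard_scores(rating_raw, new_mean=100, new_sd=20)
    h = standard_scores(habit_raw, new_mean=50, new_sd=10)
    return [ri + hi for ri, hi in zip(r, h)]


# Hypothetical raw scores for five drivers (not data from the study)
ratings = [12, 15, 9, 18, 11]
habits = [30, 28, 35, 40, 27]
criterion = criterion_scores(ratings, habits)
```

Because the rating component has twice the spread of the habit component, a one-standard-deviation difference in ratings moves the criterion 20 points while the same difference in habits moves it 10, which is the double weighting noted above.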

The second phase was divided into two parts. 
Run Two-A was for purposes of establishing the 
validity of tests which gave promise in the pre- 
liminary run. Run Two-B was made for purposes 
of establishing the validity of certain psychophysical 
tests used by the Army and others of similar type. 
Also some tests which were doubtful in the pre- 
liminary run were rechecked. In the third phase or 
cross-validation run the most efficient predictors were 
selected for use from the first two runs. 

Subjects were scheduled for these test administra- 
tions at nine installations and at 18 different times 
in the Second, Fourth, and Fifth Army areas. Some 
installations were used two or more times. In all, 
there were 1,126 subjects used in computing results 
distributed as follows: 


Preliminary Run 
Run Two-A 
Run Two-B 
Run Three 


468 
203 
124 
331 


Total number of cases 1,126 


Results 


In the final run, using 331 subjects, 32
variables were correlated with the criterion
and the intercorrelations calculated.2 From
these results 18 were found to be significant
at the 5 per cent level of confidence or better.

Combinations of different tests and varying
numbers of tests were investigated. Results
in general show that there was little gain in
validity when more than seven variables were
used. Table 1 presents the tests of the battery
which was selected on the basis of
reliability, working time, convenience of administration,
uniqueness of domain, and predictive
efficiency. The seven predictors given
in Table 1 combined into a multiple gave
an R of .38. Comparison of validities for
selected paper-and-pencil tests and psychophysical
(individual) tests may be noted by
comparing data in Tables 1 and 2.

2 A table of these intercorrelations will be furnished
anyone interested in detailed relationships.

The nine predictors in Table 2 combined 
into a multiple gave an R of .29. The sig- 
nificant and near-significant validities are all 
in the expected direction with the exception 
of age. The latter relationship is assumed to 
hold only for the range of ages represented by 
the sampling of Army drivers used which 
yielded a mean of 22.5 and an SD of 2.80. 

The psychophysical tests designated as PRT 
565 included the Snellen acuity, color percep- 
tion, distance or depth perception as meas- 
ured by the Howard-Dohlman test, brake re- 
action time, hearing, and score on the Army 
written test. This battery was used as a com- 
posite predictor. The scores on individual 
tests were coded into 10 groups and weighted 
according to a system used by Lauer (5) in 
order to eliminate undue influence of certain 
tests such as color blindness, which has always
been found to show extremely low correlation
with accidents or other criteria in
driving. This method was considered adequate
for the purpose. Standardization of
scores to make the composite was deemed an 
unnecessary refinement of these measures. 

The respective validity coefficients for this
composite in Run Two-B and Run Three were
+.23 and +.12, respectively. It should be
kept in mind that the composite score included
a type of short Driver Know-How.
The validity of the latter test stood up well
throughout these studies. To some extent
this undoubtedly bolstered the validity r’s obtained
with the criterion.

Table 2

Psychophysical Test Validities

Predictor              Best Ratings        Functions Assumed Sampled                                    Validity
Age                    (older)             (To nearest birthday)                                        .12
Steadiness             (most steady)       (Control of involuntary movement)                            .10
Field of vision        (equivocal)         (Lateral field for movement)
Strength               (stronger)          (Grip as measured by the Smedley dynamometer)
Motility               (more active)       (Speed of movement)
Choice reaction time   (shorter time)      (Reaction to red with green lights interspersed at random)
Choice reaction time   (fewer)             (False starts on green)
Choice reaction time   (more consistent)   (Mean difference between successive trials)
Visual acuity†         (equivocal)         (Armed Forces test)

* Significant at the 5% level of confidence.
** Significant at the 1% level of confidence.
† Since the drivers had been doubly screened before being allowed to drive, the acuity level was relatively homogeneous. This would tend to reduce the validity of visual acuity as a predictor.

Thus it will be seen that the paper-and-pencil
group tests, with age included, account for
approximately twice the criterion
variance accounted for by the battery of psychophysical
tests with age added. Further,
the paper-and-pencil group tests accounted for
twice the variance accounted for by the combined
subtests in the PRT 565 battery of predictors
widely used in measuring Army and civilian
driver aptitude.
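
The variance comparison can be checked by squaring the reported coefficients. The sketch below assumes that "criterion variance measured" means the squared multiple or validity coefficient; that reading, and the variable names, are assumptions rather than anything stated in the report.

```python
def variance_ratio(r_a: float, r_b: float) -> float:
    """Ratio of criterion variance accounted for by predictor set A versus B."""
    return r_a**2 / r_b**2


paper_and_pencil = 0.38       # multiple R for the Table 1 battery
psychophysical = 0.29         # multiple R for the Table 2 battery
prt_565_runs = (0.23, 0.12)   # PRT 565 composite, Run Two-B and Run Three

ratio_vs_psychophysical = variance_ratio(paper_and_pencil, psychophysical)
ratios_vs_prt_565 = [variance_ratio(paper_and_pencil, r) for r in prt_565_runs]
```

Under this reading the ratio against the psychophysical battery is roughly 1.7, and the ratios against the PRT 565 composite exceed 2 in both runs.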

From the total study three or four alter- 
nate batteries of paper-and-pencil tests were 
developed and cross validated. The efficiency 
of these predictors was found to hold up on 
successive samplings and to predict heavy 
driving performance even better than that of 
light or mixed Army drivers. In a subsequent 
study validities for these tests have been 
found to run around + .45 for heavy drivers. 


Summary and Conclusions 


In a contract research with the Department
of Army several predictors composed of paper-and-pencil
tests of driving performance
were developed and validated. Using a three-phase
validation program with samplings of
203, 124, and 331 Army drivers, respectively,
from nine different installations, the multiple
R for the final battery was of the order .38.
The criterion used was an instrument developed
by TAGO, Army Ratings for Drivers,
PRT 2408.

Comparable size batteries of individual psy- 
chophysical tests were found to yield validi- 
ties around .25, and a composite of PRT 565 
predictors was found to have validities of 
about .15, using the same criterion. 

Since most of the paper-and-pencil tests 
can be administered to groups of possibly 100 
recruits or more at a time and measure at 
least twice the criterion variance as that 
measured by individual psychophysical tests, 
the increase in general efficiency is readily 
shown from a field study of 1,126 Army 
drivers. 

Subject to the limitations of the study as 
described, and other possible influences, the 
following conclusions may be tentatively 
stated: 


1. The hypothesis that a properly selected 
battery of group paper-and-pencil tests of 
driving aptitude is a better predictor of driv- 
ing ability than conventional psychophysical 
tests was confirmed. 

2. A battery of group tests was developed
to show a validity of .38. 

3. The reliabilities of these tests were com- 
parable with those of standardized commer- 
cial tests widely used for evaluation purposes. 

4. A completely new approach must be 
made in the measurement of driver aptitudes 
by so-called psychophysical tests adminis- 
tered individually if the efficiency of driver 
selection is to be raised. It would seem that 
a complete revision is warranted. 

5. The predictive efficiency of the Army driving
aptitude tests described is nearly equal
to that of intelligence and aptitude tests used
for prognosticating scholastic aptitude in
schools and colleges.


Received August 27, 1954.


Most modern industrial jobs are repetitive
and alleged to be uncreative. Observers of
the industrial scene have been greatly concerned
about the consequent feelings of monotony
and boredom among the workers. Superficially,
it would seem that repetition in work
would be a cause of boredom, that work which 
appeared repetitive to the observer would 
necessarily be accompanied by boredom, and 
work with apparent variety by absence of 
boredom. Industrial investigations estab- 
lished very early, however, that jobs with all 
the appearance of being repetitive were not 
always considered monotonous by the work- 
ers (e.g., 4, 8, 11, 21). Investigations of 
clerical workers, school teachers, and profes- 
sional workers, on the other hand, have re- 
peatedly indicated that many persons find 
each of these more varied kinds of work bor- 
ing. (For summaries of these studies, see 
Hoppock and Robinson [6] or Viteles [17].) 

An observer of a job may classify it as 
repetitive solely on the basis of the observed 
frequency of repetition of the task. This 
type of classification does not take into ac- 
count the perceptions of the worker. Repe- 
tition for the worker depends upon what he 
perceives in the task, and his perceptions are 
not subject to immediate scrutiny by an ob- 
server. For instance, if a worker perceives 
variety in the minute changes of detail or in 
the social situation around him, the job is 
for him one in which there is variety. Repe- 
tition as defined by externally observable fre- 
quency of occurrence cannot be stated as a 
valid cause of monotony. Repetition is rather 
one characteristic of the task as perceived by 
the worker—one aspect of the experience of 
monotony. 


1 This paper is a portion of a dissertation presented
in partial fulfillment of the requirements of
the Ph.D. degree at Cornell University, 1942. The
writer is deeply indebted to Dr. T. A. Ryan for his
guidance, and to the management, union officials,
and workers whose active cooperation and assistance
made the study possible.

These considerations enable us to define
our terms, at least in a general way. For the 
purposes of this discussion, we shall use the 
terms monotony and boredom interchange- 
ably to designate the experience which arises 
from the continued performance of an ac- 
tivity which is perceived as either uniform or 
repetitive, and which also induces a desire 
for change or variety. This definition, ob- 
viously, restricts monotony to the experience 
of the individual. It has frequently been 
suggested or assumed that there are indi- 
vidual differences in susceptibility to such ex- 
periences (9, 11, 21). This study was under- 
taken to investigate factors in the individual 
which might predispose him to experiences of 
boredom. 


Procedure 


The research was conducted in a small knitwear 
mill in northern Pennsylvania. Most operators in 
the plant, and all included in this study, were paid 
by piece rate. The active support of both the plant 
manager and the business agent of the union was 
a major factor in securing the confidence of the 
workers. 

Although feelings of boredom are most directly 
assessed by verbal reports of the workers, we con- 
ducted a preliminary study in an attempt to obtain 
more objective supplementary criteria. As reported 
previously (14), we found, however, that such in- 
direct indicators of boredom as talking, frequency of 
rest pauses, average working speed, and the shape 
and variability of the output curve were both un- 
reliable and invalid in this situation. Verbal report 
was the only available criterion. 

Detailed interviews with a number of workers laid 
the groundwork for the main study, orienting the 
investigator and aiding the union and management 
in “selling” the workers on the desirability of the 
study. The subjects of the main study were 72 
women workers, all engaged in light, repetitive work. 
We included only sewing-machine operators who 
had been on the job three months or more, and who 
remained continuously on the same task throughout 
each working day. The questionnaire form was 
used rather than interviews, both because it per- 
mitted contact with a larger number of workers and 
because any later application of the findings would 
be much more practical if information could be 
gathered in paper-and-pencil form. All questions
were pretested and, when necessary, revised before
administration to the main group. Questionnaires 
were unsigned. 

Some criterion questions concerning the experience 
of monotony were adapted from the much-quoted 
studies of Wyatt, Langdon, and Stock (21); others 
were suggested by the interviews. Included were 
such obvious items as “Do you often get bored with 
your work?,” “Is your job too monotonous?,” 
“Would you like to change from one type of work 
to another from time to time if the pay remained 
the same?,” and some similar multiple-choice and 
completion items. The frequencies of choice of an- 
swers to each question were compared for the entire 
group, by the chi-square technique, and a weighted 
criterion score was devised on the basis of those 
items most closely agreeing with each other. Sub- 
jects were then separated into three approximately 
equal groups—the nonsusceptible, the susceptible, 
and an intermediate group of workers. This cri- 
terion seemed superior to the criteria used by most 
previous investigators, who relied either upon shape 
of the production curve or on answers to questions 
about preference for regularity in daily habits outside
the work situation (3, 21, 11, 15). Table 1
shows the criterion questions.


Table 1

Criterion Questions and Weighting of Each

Question                                            Answer    Weight

Do you often get bored with your work?              Yes          1
                                                    ?            0
                                                    No          −1

Is your job too monotonous?                         Yes          2
                                                    ?            0
                                                    No          −2

Would you like to change from one type              Yes          1
of work to another from time to time,               ?            0
if the pay remained the same?                       No          −1

Would you like to be a forelady?

What time of day seems most boring to you?
    Choice of any hour from 7 A.M. to 3 P.M.
    Choice of hour from 3 P.M. to 4 P.M.
    Choice of any hour outside of working hours

How well do you like the work that you do?
    I think that it is extremely monotonous
    I think that it is very monotonous
    I think that it is pretty monotonous
    I think that it is not very interesting
    I think that it is pretty interesting
    I think that it is very interesting
    I think that it is fascinating

Is there anything about the work which you particularly dislike?
    It is too monotonous
    Other responses
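
The scoring described above (a weighted sum over criterion items, followed by a split into three roughly equal groups) can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: only the three items whose weights survive in Table 1 are included, the pairing of the weights 2, 0, −2 and 1, 0, −1 with "Yes", "?", "No" on the second and third items is assumed from the pattern of the first item, the item keys are hypothetical names, and ties at the group boundaries are broken arbitrarily by list order.

```python
# Weights for the three Yes/?/No items whose weights are legible in Table 1.
# The Yes/?/No assignment for the last two items is an assumption.
WEIGHTS = {
    "bored_with_work": {"Yes": 1, "?": 0, "No": -1},
    "job_too_monotonous": {"Yes": 2, "?": 0, "No": -2},
    "like_to_change_work": {"Yes": 1, "?": 0, "No": -1},
}


def criterion_score(answers):
    """Sum the item weights for one unsigned questionnaire."""
    return sum(WEIGHTS[item][answer] for item, answer in answers.items())


def split_into_thirds(scores):
    """Label workers by rank: lowest third nonsusceptible, highest third susceptible."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    labels = [""] * len(scores)
    third = len(scores) / 3
    for rank, i in enumerate(order):
        if rank < third:
            labels[i] = "nonsusceptible"
        elif rank < 2 * third:
            labels[i] = "intermediate"
        else:
            labels[i] = "susceptible"
    return labels


# Hypothetical respondent, for illustration only.
example = {
    "bored_with_work": "Yes",
    "job_too_monotonous": "Yes",
    "like_to_change_work": "No",
}
example_score = criterion_score(example)

# Hypothetical scores for six workers.
example_labels = split_into_thirds([4, -4, 0, 2, -2, 1])
print(example_score, example_labels)
```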

The worker who does not suffer from monotony 
when doing repetitive work has been portrayed in 
the literature as an inferior, insensitive sort of per- 
son—placid, extraverted, happy (10, 15, 21), unable 
to daydream (21), uncreative (18, 21), and, above 
all, unintelligent (2, 7, 16, 18, 19, 20, 21, but not 
15). Such portrayals in textbooks and in journals 
are distinguished more for their literary than for 
their scientific value. Nevertheless, they furnished
us with a number of hypotheses concerning
characteristics related to susceptibility to monotony.
Further hypotheses were formulated during the
preliminary interviewing and observational periods.
Questions designed to test each of these hypotheses 
were included in the questionnaire, and answers to 
each question were compared for the three criterion 
groups. Significance of relationships was tested by 
the chi-square test. 

All three criterion groups were included in every 
analysis. Where the response categories were “Yes, 
?, No,” the “?” response was, in most cases, chosen 
too infrequently to make the chi-square technique 
applicable. The interpretation of such responses is 
ambiguous, moreover, meaning for various respond- 
ents, “I don’t know,” “Sometimes one, sometimes the 
other,” “I don’t understand the question,” “I don’t 
wish to answer the question,” or “The question does 
not apply to me.” Usually these responses were 
omitted from the analysis. In a few cases, indicated 
in the tables, the “?” response was chosen suffi- 
ciently often to bring the predicted frequencies up 
to five. In these instances, as in all questions with 
multiple-response categories, the significance was first 
tested for all responses. Responses were then
grouped into two categories for purposes of comparability.
An additional p value is therefore reported
in each case for a 3 × 2 chi-square table.
Yates’ correction for continuity was applied throughout.
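
A chi-square test on a 3 × 2 table of the kind described (three criterion groups by two response categories) can be sketched as below. The paper does not give its computing formula, so the cell-by-cell form of the continuity correction used here is an assumption, and the frequencies are invented for illustration. With two degrees of freedom the upper-tail probability has the closed form exp(−χ²/2), so no statistical library is needed.

```python
import math


def chi_square_3x2(table, continuity=True):
    """Chi-square for a 3 x 2 frequency table; returns (chi2, p).

    table: three rows (criterion groups) by two columns (response categories).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if continuity:
                # Continuity correction (assumed form): shrink each
                # deviation by 0.5, but never below zero.
                diff = max(diff - 0.5, 0.0)
            chi2 += diff ** 2 / expected
    # (3 - 1) * (2 - 1) = 2 degrees of freedom, for which the
    # upper-tail probability is exactly exp(-chi2 / 2).
    p = math.exp(-chi2 / 2)
    return chi2, p


# Invented counts: rows are susceptible, intermediate, nonsusceptible;
# columns are the two grouped response categories.
counts = [[18, 6], [12, 12], [6, 18]]

chi2_raw, p_raw = chi_square_3x2(counts, continuity=False)
chi2_corr, p_corr = chi_square_3x2(counts, continuity=True)
print(f"uncorrected: chi2 = {chi2_raw:.3f}, p = {p_raw:.4f}")
print(f"corrected:   chi2 = {chi2_corr:.3f}, p = {p_corr:.4f}")
```

The corrected statistic is always the smaller of the two, which makes the test somewhat more conservative; the requirement mentioned above that predicted frequencies reach five applies to the `expected` values in this sketch.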

Table 2 shows the questions, the most frequently 
chosen answers for each of the three groups, and the 
level of significance for each, for all those relation- 
ships found to be significant at the 5 per cent level 
or better and to differentiate between the extreme 
groups. Tables 3-12² show this information for all
questions used. 




Hypotheses and Results 


Hypothesis I. Younger workers are more 
susceptible to monotony than older ones. 
Age was compared with answers to the cri- 
terion questions. The results showed that 
workers under 20 were significantly more sus- 


2 Tables 3-12 have been deposited with the Ameri- 
can Documentation Institute. Order Document 4627 
from ADI Auxiliary Publications Project, Photo- 
duplication Service, Library of Congress, Washing- 
ton 25, D. C., remitting in advance $1.75 for micro- 
film or $2.50 for photocopies. Make checks payable 
to Chief, Photoduplication Service, Library of Congress.


Table 2
Summary of Items Significantly Related to Monotony Susceptibility

(Answers to questions and p values derived from the chi-square test of significance of relationships. All
p’s based on 3 × 2 tables unless otherwise indicated.)

Question        Answer        Answers chosen most frequently by (Monot. Susc. / Middle Group / Non-susc.)

How old are you?
    Under 20  X
    20-24
    25-29
    30-34
    35 and over

    Under 25
    25 and over

How long have you been doing the same work?
    Less than 3 mo.
    3-6 mo.
    6 mo. and over

Do you like to daydream at your work around the house?
    Yes
    ?
    No

    Yes
    ?

Do you especially like to have a definite schedule of home duties so that you can do them at the
same time every day?
    Yes
    No

If you had an evening to spend as you liked, what would you usually rather do? (CHECK AS MANY AS YOU LIKE)
    Knit‡
    Sew
    Visit friends
    Go to the movies
    Listen to the radio
    Go down town
    Read
    Dance
    Drive around in the car
    Something else you would like to do very much. (If so, what? ______)

    More than average number checked
    Average number or fewer checked  X

When you are working around the house, what job do you prefer? (CHECK AS MANY AS YOU LIKE)
    Washing dishes
    Dusting
    Cleaning drawers
    Washing clothes
    Scrubbing floors
    Cooking
    Drying dishes
    Cleaning rugs
    Washing windows
    Ironing
    Making beds
    Mending
    Something else around the house which you like to do very much. If so, what? ( )

    More than average number checked
    Average number or fewer checked

Do you often quarrel with anyone in your home?
Are you anxious to get away from home?
Do you often quarrel with your mother?
Do you often quarrel with your father?

Have you had any other job, either in the mill or elsewhere, which you prefer to the one you have now?

Is there anything about the work which you particularly dislike? If so, what? ( )

What part of the day do you enjoy more,
    the part of the day you spend in the mill?
    the part of the day you spend away from the mill?

How good do you feel that your job is? Check the opinion which is most like yours. (CHECK ONLY ONE)
    a. I would not stay on my job for a minute if I could get something else. It is very unpleasant work.
    b. I would really prefer something else, but it is all right here.
    c. I like this job about as well as any job which pays the same.
    d. I like this job better than most jobs I know about.
    e. I wouldn’t take another job unless it paid a good deal more.
    f. I feel that it is the ideal job for me.

    Responses a, b, and c
    Responses d, e, and f

Do you know of any job you would rather have than the one you have, even if the pay were the same?
    Yes
    No

Is there anything that could be done to make your job more pleasant? If so, what? ( )
    Yes
    No  X  .008

Do you like it (the work) better or worse than when you started?
    Better
    Worse  X  .02

* p based on more than two response categories.
** Trend disappeared when records of workers only between ages of 20 and 25 were analyzed.
† “No” was infrequently chosen and omitted from analysis.
‡ Item checked by only three workers.
§ Analyzed with response checked vs. not checked.
¶ These were almost all quiet domestic activities.


ceptible and those over 35 significantly less 
so, although no relationship obtained between 
20 and 35. Three other kinds of personal his- 
tory items were investigated—marital status, 
number of children, and number of years of 
experience at this or similar work—but trends 
in these proved to disappear when age was 
held constant (see Footnote 2, Table 3). 
Age, then, is a correlate of susceptibility in 
this group. 

Hypothesis II. Susceptible workers are 
more ambitious, either for themselves or for 
their children. No item in this list discrimi- 
nated among the groups (see Table 4). Level 
of aspiration seemed not to be related to feel- 
ings of monotony in this sample. 

Hypothesis III. The susceptible worker
does not daydream. It has been suggested 
that the nonsusceptible worker does not feel 
the monotony because he daydreams. This 
hypothesis was rejected by this study, the an- 
swers showing an insignificant tendency (8% 
level) in the other direction (see Table 5). 
The monotony-susceptible workers tended to 
daydream more, both in and out of the plant. 

Hypothesis IV. The susceptible worker is 
likely to be extraverted. This suggestion is 
related to the preceding one, the extravert 


presumably needing more stimulation from 
his environment. However, none of the 
personality-questionnaire items, intended to 
measure introversion-extraversion, was related
to the criterion in this study (see
Table 6). 

Hypothesis V. The susceptible worker will 
be more restless in his daily habits and in his 
leisure-time activities. Two kinds of ques- 
tions were included to investigate this
possibility.

First, questions concerning preference for
definite schedules of home duties, for walking the
same way to school (or work) every day, for
staying home during vacations, etc.,
had been used as criterion questions in sev- 
eral earlier studies (11, 15). The relation- 
ship of feelings of boredom at work to pref- 
erence for variety in other situations had 
not been proved, and hence such questions 
scarcely seemed satisfactory as criteria. Pref- 
erence for regularity might prove to be a 
general trait, however, so several such ques- 
tions were included in the questionnaire. The 
results were in the direction of the hypothe- 
sis for eight of the ten questions, two being 
indeterminate (see Table 7). A total of all
these items was related to the criterion at
the 2% level. 

Secondly, leisure-time preferences were in- 
vestigated by two check lists of recreational 
and housework activities, constructed on the 
basis of the preliminary interviews. All but 
four of the 21 activities were checked more 
frequently by the nonsusceptible group, these 
17 including all the housework items and all 
of the recreational activities which could be 
performed sitting down, with the exception of 
reading and driving around in the car, which 
were checked more frequently by the middle 
group. The susceptible group, in contrast, 
preferred dancing or going down town (which 
in this community meant window shopping) 
(see Table 8). The total number of the 
items checked was related to the criterion at 
the 1% level. Since most of these activities 
were settled and routine, we must consider 
the hypothesis quite tenable that susceptible 
workers are more likely to be restless outside 
the plant than less susceptible workers. 

These two groups of items proved not to be 
significantly related to each other when re- 
sponses for workers in a restricted age range 
(20-35) were compared, although each group 
was still significantly related to the criterion. 
Differences in living arrangements and per- 
sonal circumstances may make it possible for 
workers to develop preferences for irregu- 
larity in only limited portions of their lives, 
or preferences may actually be expressions of 
two separate traits. 

Hypothesis VI. The susceptible worker 
will be less satisfied with his personal, home, 
and plant situation in aspects not directly 
concerned with uniformity or repetitiveness. 

Questions concerning personal adjustment 
included a number from various personality 
questionnaires, designed to measure so-called 
neurotic tendencies. Most of these items 
showed no significant relationship to the cri- 
terion (see Table 9). The few differentiating 
items referred not to neuroticism in general, 
but to feelings of persistent depression and 
discouragement, and seemed to concern con- 
tentment more than tendency to show other 
neurotic symptoms. The distribution of obtained
chi squares does not deviate from
chance expectancy sufficiently to be sure that
the relationships between these items and the
criterion are not merely fortuitous. The consistency
of kind of item, however, is encouraging.
To a certain extent, then, the discontented
persons in this group also found their
work monotonous. This finding is probably 
related to the results concerning more fre- 
quent daydreaming in the susceptible group. 

Questions concerning home life showed 
clear-cut relationships to the criterion. All 
but one of these items discriminated at better 
than the .05 level of significance, and the to- 
tal of all at the .0002 level (see Table 10). 
Home adjustment is a close correlate of sus- 
ceptibility in this sample. 

General work adjustment, similarly, ap- 
peared to be closely related to complaints of 
monotony. Nine of the eleven questions con- 
cerning working conditions and work atti- 
tudes showed significant relationships at the 
10% level or below, the median of the eleven 
being below the 1% level (see Table 11). 
For the total group, and also for the group 
with ages between 20 and 35, relationships 
among home, personal and work adjustment 
total scores were compared. All were sta- 
tistically significant (p less than 5% level). 
(Extremely unfavorable comments concerning 
factory conditions and supervision occurred 
frequently in all criterion groups. Differ- 
ences in frankness do not, therefore, account 
for the relationships found among the various 
kinds of dissatisfaction.) The overlapping 
was very great here; either feelings of mo- 
notony color all of the attitudes of the work- 
ers toward their families, personal lives, and 
work, or these feelings are a reflection of a 
general dissatisfaction. 

Hypothesis VII. The monotony-susceptible
workers are more intelligent than the others. 
Because of the correlation between intelli- 
gence and educational level in such groups, 
and because information concerning educa- 
tional level could be readily obtained with- 
out sacrificing anonymity, educational back- 
ground (see Table 12) was compared with 
reported monotony for the entire group. 
Education ranged from fourth grade to two 
years of college. No significant relationship 
appeared with the criterion,³ although there


3 In an attempt to assess intelligence more directly, 
a group test was given to workers from the extreme 
categories. Turnover, transfer, and desire to retain 
anonymity reduced the groups to eight susceptible

was a slight tendency toward higher educational
levels in the nonsusceptible group.

Hypothesis VIII. Feelings of monotony 
are not merely a function of the task per- 
formed, but are related to other more gen- 
eral factors in the individual worker. This 
hypothesis, of course, underlies the others, 
and is clearly tenable in view of the number 
of personal correlates demonstrated in this 
sample. 

In summary, four kinds of items were to a 
significant degree negatively related to re- 
ports of monotony in this group: 


1. Age, which was related to the other per- 
sonal history items and to some extent to the 
adjustment or contentment score. 

2. Preference for regularity in daily routine. 

3. Lack of restlessness as expressed in pref- 
erence for inactive leisure-time activities. 

4. Satisfaction in personal, home, and fac- 
tory life. 

For the homogeneous age group (20-35),
combined scores for these last three groups
of items concerning preference for regularity,
restlessness, and satisfaction proved to have
no significant correlation with one another
when tested by the chi-square test (p’s = .52,
.43, and .19). A combined weighted score of
these factors clearly separated the criterion
groups. Still better separation was achieved
by the use of minimum cutting scores based
on a pattern analysis.


Discussion 


Cross validation, with employees tested be- 
fore employment, is obviously necessary in a 
study of this kind, despite the relative speci- 
ficity of the predictions, to establish gener- 
ality of relationships and direction of causa- 
tion. Direct repetition of this study so far 
has been impossible. Confirmation of the re- 
sults has, however, come from several sources. 

First of all, the negative finding concerning
intelligence has considerable support. One
early study, that of Thompson (14), found
a slight inverse relationship between intelligence
and susceptibility to uniformity. Even
the classic study of Wyatt et al. (21) reported
such small differences between the
most and least bored groups that in only
four of the ten comparisons were they
statistically significant. Other studies (e.g., 7)
were based on turnover, rather than job satisfaction,
and results might well be attributable
to more able persons securing better jobs
than less able. More recently, Heron (5)
has reported results consistent with those of
the present study, using male unskilled workers
in England. His criterion of job adjustment
(a special rating by supervisors) proved,
when age and experience were partialled out,
not to be predicted by “General Mental
Ability.” The relationship between intelligence
and boredom is by no means established.


and 13 nonsusceptible workers, too few, of course,
for drawing of broad generalizations. The median
and mean, however, were higher for the nonsusceptible
than for the susceptible groups, confirming
the results obtained with education.

The positive findings concerning prefer- 
ences, personality characteristics, and age 
have received support from several sources. 
Five studies have been completed by the 
writer, using women workers in garment fac- 
tories in various sections of the country, and 
comparing answers at the time of employ- 
ment with later absences and with length of 
service. Results have been very consistent 
for age at time of employment, it being nega- 
tively related to absences and positively re- 
lated to subsequent length of service up to 
the age of 45. Preference for regularity and 
lack of restlessness items predicted only ab- 
sence rate, and only in the two studies in 
which the towns had populations of over 
25,000. The satisfaction items predicted both 
absence and length of service in all situa- 
tions, but to highly variable extents, possibly 
reflecting differences in the tendencies of ap- 
plicants in various situations to fake such 
items. Even this degree of consistency is 
encouraging, however. 

Other investigators have reported support- 
ing evidence concerning the generality of the 
traits involved. Pierce (12), using college 
students, showed a relationship between poor 
scores on a modification of the home adjustment
items and flexibility as measured by
Luchins’ Einstellung test. Bews (1) simi- 
larly for college students showed a relation- 
ship of poor home adjustment scores to sus- 
ceptibility to satiation in laboratory tasks. 
Heron’s results are remarkably consistent (5). 
In addition to the negative finding concerning
intelligence, he reported a positive relationship
between job maladjustment and
“Emotional Instability” (which included
“many Worries”), but no relationship with
“Neurotic Extraversion” (“Hysteric Tendency”).
A fourth factor, “Speed of Approach,”
not directly comparable to any in
the present study, showed low predictive
value.

The picture which emerges from these
studies of the personality of the person who 
is satisfied doing repetitive work is one of 
contentment with the existing state of affairs, 
placidity, and perhaps rigidity. His satisfac- 
tion would seem to be more a matter of close 
contact with and acceptance of reality than 
of stupidity or insensitivity. 

Since the preference for uniformity in work 
extends into daily habits outside the work 
situation, is related to lack of conflict or re- 
bellion in the home, and is correlated with 
contentment both in the factory and out, feel- 
ings of monotony seem to be symptomatic of 
other discontent and restlessness rather than 
specific to any particular task. 


Summary and Conclusions 


Responses to questions concerning feelings 
of monotony and boredom on the job were 
compared, for a group of 72 women, with an- 
swers to other questions designed to test hy- 
potheses derived primarily from accounts of 
previous writers concerning the personal char- 
acteristics associated with susceptibility to 
monotony. Four hypotheses were not sup- 
ported in this study: that the susceptible 
worker is more ambitious, tends not to day- 
dream, is extraverted, and is more intelligent. 
Three remained tenable: that the susceptible 
worker is likely to be young, restless in his 
daily habits and leisure-time activities, and 
less satisfied with personal, home, and plant 
situations in aspects not directly concerned 
with uniformity or repetitiveness. 

On the basis of this and confirming evi- 
dence, an eighth hypothesis was considered 
tenable: that feelings of monotony are not 
merely a function of the task performed, but 
are related to more general factors in the in- 
dividual worker. It was suggested that satis- 
faction with repetitive work does not neces- 
sarily reflect insensitivity and stupidity, as 
the more romantic textbooks seem to imply. 


Received September 27, 1954.


Modern technology has produced a large
number of very simple and highly repetitive 
motor tasks. This technology has also de- 
vised many ways by which workers can be 
motivated to perform these tasks by the in- 
troduction of incentives which are not psy- 
chologically a part of the tasks themselves. 
Consequently, one might assume that the av- 
erage operator finds little of interest in these 
tasks themselves, and that continued perform- 
ance during any one time period might be due 
solely to incentives, either social or eco- 
nomic, which are external to the task. 

Recently it has been emphasized that there 
are very real internal incentives which de- 
velop as a result of the continued perform- 
ance of a task, even in unskilled repetitive 
work. A British economist, W. Baldamus 
(1, pp. 1-15), has described and labeled one 
such factor “batch traction.” He describes 
batch traction as an involuntary relationship 
between a worker and his task which “mildly 
urges” an operator to work through or to 
finish a bundle, a group of units, or a lot be- 
fore taking a voluntary rest period. Batch 
traction is only one of a general class of posi- 
tive factors in work which he calls traction. 
“Traction is, in a sense, the opposite of dis- 
traction. It is a feeling of being pulled along 
by the inertia inherent in a particular ac- 
tivity. The experience is pleasant, because 
it is associated with a feeling of reduced ef- 
fort. It... is the most characteristic ex- 
perience in industrial occupations” (1, p. 42). 
The concept of a need to complete a task is 
scarcely a new one to psychologists; Balda- 
mus’ comments are unusual in that they de- 
scribe a positive, a pleasant relationship be- 
tween a task and the person who performs it. 

If these mild positive factors do exercise an
influence upon workers in repetitive tasks, it
is clear that they should be maximized to reduce
effort and conflict for the worker and
possibly to make his work more satisfying.
It is possible, moreover, that eventually such
decreased effort would be reflected in decreased
absence and turnover and even in increased
production.


1 The research was completed while the junior author
was a student at Cornell University.


From the description Baldamus has given 
of batch-traction effects, we should expect 
them to be greatest when the lots are large 
enough to constitute an easily grasped psy- 
chological unit, and they should reduce as 
lots become so large that the worker has the 
end of a lot in sight for only a small percent- 
age of the time. Reduction of very large lot 
sizes should, then, increase the effects of batch 
traction. 

Several criteria have been used in previous 
studies of the effects of lot sizes, or suggested 
themselves here: 


1. Introspection by educated persons, such 
as Baldamus, who have engaged in repetitive 
work. This evidence may be criticized on the 
grounds that such observations may not be 
typical of those of workers accustomed to 
production work. 

2. Expressed preferences of workers for 
smaller lots when they have been introduced 
in the course of other experiments (9). The 
effects of suggestion on the one hand, or of 
resistance to change on the other, could not 
be even partly controlled in such studies 
without the use of prohibitively long periods 
of experimentation. Both effects seemed 
likely to be important in this situation. Data 
concerning preferences were gathered as sup- 
plements to other criteria in the present study. 

3. The over-all shape of the work curve 
(9). The evidence of the effects of lot size 
based on this index was not considered conclusive,
and the index was eliminated from
consideration in this case because such curves
are often found to be unreliable, and judg- 
ment of their shape is highly subjective (7). 

4. An increase in the output toward the 
end of each lot would be a less subjective 
index of traction effects, but in most factory 
situations in which experiments could be 
plausibly carried out, such changes could be 
attributed as well to differences of ease of 
handling work from a nearly empty tray or 
from a nearly full one. This index was also 
eliminated. 

5. Changes in total production with changes 
in lot size. Previous studies (4) using this 
index have been limited because such factors 
as amount of experience on the task (2, 5), 
and fear of work shortages, changes in meth- 
ods, and alternation of jobs (4) were not con- 
trolled. Even when these factors have been 
controlled, however, the changes in total pro- 
duction are satisfactory indices only if work- 
ers are not restricting their production to a 
set figure. These changes, moreover, are al- 
most as susceptible to the effects of sugges- 
tion, positive or negative, as are verbal re- 
ports. Restriction was clearly evident in all 
departments available for this study, as evi- 
denced by extremely small variability from 
time to time and from worker to worker, and 
by verbal reports of foremen and workers. 
Therefore, the relatively mild effects of batch 
traction could not be expected to result in 
production increases except over very long 
periods of time. 

6. Personnel records, such as absence, tar- 
diness, and accident records. These are un- 
suitable because they are unreliable over 
short periods of time. 

7. Frequency of voluntary work stoppages. 
This appeared to be the most flexible, objec- 
tive, and reliable criterion of the effects of 
batch traction. We predicted that workers 
experiencing batch traction would tend to 
wait until they had completed a lot before 
taking a voluntary break, and that workers 
not experiencing traction so strongly would 
tend to take more frequent rest pauses 
throughout each lot. Thus, decrease in fre- 
quency of pauses and its correlate, increase 
in length of time worked between stops, would 
be taken as indices of increases in batch trac- 
tion effects. 


Choice of Operations for Study 


Changes in effects of batch traction should 
be most easily observable in work which is 
light and repetitive, and in which the workers 
are accustomed to very large work lots. The 
social conditions of the departments involved 
should permit rather free variation in spac- 
ing and frequency of rest periods. For more 
general applicability, more than one job 
should be studied, preferably with similar 
cycle times. These conditions were remark- 
ably well satisfied by two jobs performed in 
a small department in a medium-sized metal- 
working firm. 

Three nearly identical filing operations and 
one straddle-milling operation were selected. 
We analyzed the records of all of the opera- 
tors who remained on either job throughout 
the experimental period; there were five on 
filing and four on the milling operation. 

Both operations were light and repetitive, 
were paid by piece rate, and had closely simi- 
lar cycle times. The department which was 
selected permitted the operators to take voluntary 
rest periods at almost any time without management 
intervention. Both jobs required the girls to procure their own work, 
which was normally supplied in tins contain- 
ing 3,100 pieces each and taking about four 
hours for completion. Finished pieces of the 
milling operation were drop-disposed. In the 
filing operations, they were brushed aside on 
the work bench, and placed in the tins after 
a large number had accumulated. 


Experimental Design and Procedure 


The experiment was explained to the workers as an investigation of the
effects of lot size on fatigue. They were aware of the presence of the
experimenter during all observations. All workers were observed
continuously from the time of their last scheduled rest period until
closing time, a period of 105 minutes. Whenever a worker stopped work,
the time elapsed until resumption of work was recorded.² The worker was
considered to be on the job only when engaged in the primary task of
operating the machine. The time taken out for filling out production
tickets, adjusting and cleaning the machine, etc. was included in the
time stopped since these operations were for the most part under the
control of the operator. The time for these tasks was assumed to be
constant from batch to batch, on the basis of the time-study reports
from the engineering department. Work periods of less than 30 seconds
were not considered to be work periods for our purposes, being too short
to engender any traction effects. These very short work periods were
combined with the adjoining stopped periods to make one continuous
stoppage. Similarly, times of less than 15 minutes between the last stop
and the end of the day were also eliminated, since the stop at closing
time was not under the control of the operator.

² Most of the observations were made by the junior author. Some were
recorded by Mr. Thomas Turner, whose help is gratefully acknowledged.
Observer differences were checked; they were usually small, and in no
case could they have been responsible for the differences between
conditions obtained in this experiment. The first observational period
for each observer was eliminated from the computations.
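
The two exclusion rules just described (work spells under 30 seconds absorbed into the surrounding stoppage, and a final work spell under 15 minutes discarded) can be sketched as a small routine. This is only an illustration: the `Segment` record, the function names, and the sample durations are hypothetical and do not come from the original study.

```python
from dataclasses import dataclass

MIN_WORK_SECONDS = 30             # shorter work spells count as stopped time
MIN_FINAL_WORK_SECONDS = 15 * 60  # shorter final work spells are discarded


@dataclass
class Segment:
    kind: str       # "work" or "stop"
    seconds: float  # duration of the segment


def clean_segments(segments):
    """Apply the two exclusion rules to one observation record."""
    merged = []
    for seg in segments:
        kind = seg.kind
        # Rule 1: a work spell under 30 seconds is treated as stopped time.
        if kind == "work" and seg.seconds < MIN_WORK_SECONDS:
            kind = "stop"
        if merged and merged[-1].kind == kind:
            # Adjacent segments of the same kind form one continuous segment.
            merged[-1].seconds += seg.seconds
        else:
            merged.append(Segment(kind, seg.seconds))
    # Rule 2: a final work spell under 15 minutes was ended by closing
    # time, not by the operator, so it is dropped.
    if (merged and merged[-1].kind == "work"
            and merged[-1].seconds < MIN_FINAL_WORK_SECONDS):
        merged.pop()
    return merged


def count_stops(cleaned):
    """Number of stoppages in a cleaned record."""
    return sum(1 for seg in cleaned if seg.kind == "stop")


def minutes_worked(cleaned):
    """Lengths, in minutes, of the work spells in a cleaned record."""
    return [seg.seconds / 60 for seg in cleaned if seg.kind == "work"]


# Hypothetical record: the 20-second work spell is merged with the stops
# on either side of it, and the final 5-minute work spell is dropped.
record = [
    Segment("work", 600), Segment("stop", 60), Segment("work", 20),
    Segment("stop", 40), Segment("work", 900), Segment("stop", 120),
    Segment("work", 300),
]
cleaned = clean_segments(record)
```

In this sketch a very short work spell with a stop on only one side is simply merged into that stop; the original text does not say how such edge cases were handled.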

The plan of the experiment was to use three lot 
sizes, the very large size to which the workers were 
accustomed, a small size, and an intermediate one, 
presented in counterbalanced order with periods of 
adaptation preceding the recording of the effects of 
each change. As so often happens in factory ex- 
periments, a few modifications of the plan had to be 
introduced. The actual order is shown in Table 1. 
This order was followed for both tasks. 

In Conditions S and M, the work was counted by 
the foreman or by an hourly-paid worker, into small 
paper containers. Workers could take several of 
these at one time. 

The small lot sizes were omitted in the reverse 
order of conditions because the foreman and some 
workers objected to these conditions on the grounds 
of the time required to change pans. Since this time 
could have amounted, at most, for the small-sized 
lots, to ten minutes per day, according to the time- 
study records, it seemed more likely that the objec- 
tions actually reflected the foreman’s dislike of hav- 
ing to have the work recounted for the experiment. 
Nevertheless, we thought it more important to se- 
cure continued cooperation in the final condition, 
with the large lots, and this second series with small 
lot sizes was omitted. 

After expressing his objections to the reduced lot 
sizes, the foreman actually found them of such slight 
inconvenience that he forgot to restore the customary 
large lot sizes until the morning of the day preceding 
the first observation of the second period of 
large-sized lots. This delay eliminated the planned 
week of re-adaptation. Plant schedules made it im- 
possible to delay further recording. Three periods 
of 105 minutes each were recorded with the large 
lot sizes at this time. 


Results 


Two indexes of change were analyzed. 
First, the number of stoppages for each 
worker for each 105-minute period of ob- 
servation was counted; these figures were 
totaled and divided by the total number of 
observations, to give a mean number of stop- 
pages for each worker per 105-minute period 
for each experimental condition. Second, for 
each individual, the median time worked be- 
tween stops was computed for each experi- 
mental condition. The median was used as 
an index in this case since the distributions 
for time were badly skewed and a mean 
would not have been a representative index 
of central tendency for these distributions. 
Differences for each index were tested for 
significance by Friedman’s nonparametric chi- 
square test (3). To summarize the direction 
of these differences, arithmetic means of the 
individual mean numbers of stops, and of the 
individual median minutes worked were com- 
puted. The statistical treatment was based, 
of course, on the indexes for each individual, 
rather than upon these mean scores. 
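
The analysis described above can be sketched in a few lines: each worker contributes one index per condition, the indexes are ranked within each worker, and Friedman's statistic is computed from the rank sums. The data below are invented for illustration, the function names are hypothetical, the pooling of work spells within a condition is an assumption, and the sketch applies no correction for ties.

```python
from statistics import mean, median


def worker_indexes(stops_per_period, work_minutes):
    """Return (mean stops per period, median minutes worked between stops)
    for one worker under one condition."""
    return mean(stops_per_period), median(work_minutes)


def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def friedman_chi_square(rows):
    """Friedman's ranked chi-square for n workers (rows) by k conditions."""
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
    sum_sq = sum(r * r for r in rank_sums)
    return 12.0 * sum_sq / (n * k * (k + 1)) - 3.0 * n * (k + 1)


# Hypothetical mean stops per period for nine workers under the
# large, medium, and small lot conditions (not the study's data).
stops = [
    [11.5, 9.0, 8.0], [10.0, 9.5, 7.5], [12.0, 8.5, 8.0],
    [11.0, 9.0, 7.0], [10.5, 9.5, 8.5], [12.5, 10.0, 8.0],
    [11.0, 8.0, 7.5], [10.0, 9.0, 8.5], [11.5, 9.5, 7.0],
]
chi_square = friedman_chi_square(stops)
```

With three conditions the statistic is referred to the chi-square distribution with two degrees of freedom, for which the 1% critical value is 9.21; whether the authors used this approximation or exact tables is not stated.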

The results followed predictions exactly. The most frequent stops
occurred for the large lots, the least frequent for the small lots, with
the results for the medium-sized lots intermediate. Similarly, the
average time worked between stops was greatest for the small lots,
shortest for the large, with that for medium lots between. The
difference in each case was significant at less than the 1% level.


Table 1

Sequence of Conditions

Condition   Pieces   Adaptation Period                          Observations
L1*          3100    Four hours of adaptation to observation;   2 afternoons
                     several years’ adaptation to size
S1**          310    Six days’ adaptation to size               3 afternoons
M1†           620    Six days’ adaptation to size               3 afternoons
M2†           620    Consecutive to M1                          3 afternoons
S2**          310    Omitted                                    0 afternoons
L2*          3100    One day’s re-adaptation                    3 afternoons

* “Large” lots.
** “Small” lots.
† “Medium” lots.


Table 2

Effects of Variations in Lot Size upon Work Stoppages

                    Mean Number of     Minutes Worked Between
                    Stops Per          Stops Per 105-Min. Period
Size of Lot         105-Min. Period    (Mean of Medians)
Large (L1 + L2)         11.06               6.57
Medium (M1 + M2)         9.01               9.51
Small (S)                7.94              11.41
p                         .01*               .01**

* Distribution of individual mean number of stops for each experimental
period tested by Friedman's ranked chi-square test (3).

** Distribution of individual median times for each experimental period
tested by Friedman's ranked chi-square test (3).


On the basis of the evidence, the hypothe- 
sis remains tenable that there are mild posi- 
tive relationships between the worker and 
light repetitive tasks, these relationships oc- 
curring without changes in external incentives 
such as piece-rate payments. This independ- 
ence is substantiated by the constancy of the 
production figures on these operations where 
obvious restriction was practiced. No change 
in production occurred for any individual un- 
der any of the conditions, as indicated by 
production records. 

The results are probably not due to sug- 
gestion or to desire to please the experiment- 
ers. All of the workers were interviewed 
after the experiment. Although they under- 
stood that the experiment was intended to 
help them, and presumably desired to co- 
operate, they did not have any idea of the 
results which were expected, nor even that 
frequency of stoppages was the primary be- 
havioral variable being observed. Further 
evidence against effects of suggestion or de- 
sire to please is furnished by the strong and 
openly expressed dislikes for the small lot 
sizes. Only one operator preferred the small 
lots; most preferred the customary lots or lots 
intermediate between the customary and the 
experimental “medium-sized” lots. All work- 
ers reported that they had maintained con- 
stant output throughout the experiment. 

There are several possible reasons for this 
lack of preference for the small lots, in which 
behavioral changes indicated that traction ef- 
fects were at their maximum. One operator 
gave a plausible explanation of the dislike of 
the small lots on the basis of a history of as- 
sociation of difficult special lots and repair 
work with small-lot sizes. The resistance to 
change by the foreman and workers was 
also important. The dislike of planning the 
counting of the work on the part of the fore- 
man, who was very popular with the work- 
ers, is probably even more important. The 
further possibility remains, however, that the 
pull to complete a batch of work is pleasant 
only when it is relatively mild, and that it 
may become irritating when it is compara- 
tively strong. This interpretation is being 
checked by a further experiment. 


Conclusions 


1. Changes in size of lot were reflected in 
objectively observable changes in number and 
spacing of voluntary work stoppages. 

2. The results of this experiment agree 
with the predictions made from the hypothe- 
sis of batch traction. 


Received September 27, 1954. 


Previous studies have indicated that production 
and morale of industrial workers are 
influenced by participation in group discus- 
sions. At Harwood, Bavelas found that per- 
mitting the group to set its own production 
goal led to greater increase in production than 
occurred in the control group which merely 
discussed effectiveness of teamwork (2). Two 
factors have hindered interpretation of these 
results: detailed information concerning con- 
duct of the experiment was not presented, and 
there were limitations in the design of the 
experiment. Groups had not reached the 
asymptote of the learning curve and there 
appear to have been differences in training 
given the control and experimental groups. 
It has been difficult to determine whether in- 
creased output as a result of group goal-set- 
ting was an artifact of the situation or a 
manifestation of a general principle demon- 
strable upon repetition. The purpose of the 
present experiment was, in essence, to repeat 
the former study under more controlled con- 
ditions, while securing complete records so 
that any factors not under experimental con- 
trol could be later analyzed. 


Procedure 


This experiment was conducted at a midwestern 
garment manufacturing company employing approxi- 
mately 1,000 workers in office and factory. The 
procedure included a control period (pre-experimen- 
tal) and an experimental period for all subjects. 
(The original plan had included a postexperimental 
period, but business conditions rendered it impos- 
sible to continue this part of the program.) Two 
types of groups participated—those that engaged in 
group discussion only and those that also set their 
own group-production goals. Since individual pro- 
duction records were kept during the five weeks pre- 
ceding the experimental period, each group formed 
its own control. A questionnaire including multiple- 
choice and free-response types of questions was ad- 
ministered at the beginning of the experimental pe- 
riod and again at the end. Until the day of the 
first discussion session, Ss were unaware that the 
study was to be conducted or that they had been 
selected to participate. During the experimental pe- 
riod, Ss met each week in groups for discussion. 
The first session was one hour in length; the remaining 
four continued for a half-hour each for all 
groups. All sessions were tape recorded and later 
transcribed. Precautions were taken to reassure em- 
ployees about the presence of the tape recorder. Its 
operation was briefly described and they were per- 
mitted to test their own voices. The personnel di- 
rector acted as group leader throughout the five- 
week period and adhered to a list of topics prepared 
by E. Thus, the subject content for all groups was 
approximately the same. This included areas relat- 
ing to employee problems, interpersonal relations, 
company restrictions and policy, employee benefits, 
and community relations. 

The Ss were requested to cooperate with the “trial 
study” the company was conducting. They be- 
lieved their group had been chosen for convenience 
(as, indeed, was true) rather than for any special 
merit. 

Five groups of fully trained Ss participated: 


Group A: order checkers, discussion group (N = 5), salaried;

Group B: mail openers, goal group (N = 5), salaried;

Group C: swatchers, discussion group (N = 6), piece rates;

Group D: swatchers, goal group (N = 6), piece rates.


(These groups met with their supervisor and depart- 
ment managers. Groups A and B had the same de- 
partment manager but different supervisors; Groups 
C and D had the same department manager and su- 
pervisor.) 


Group E: shippers, salaried. (Originally this group had 15 members. It
was intended to serve first as a discussion and then as a goal group in
an effort to study effects of group size. However, so many members of
the group were lost due to unavoidable shifts in personnel that it had
to be dropped from the analysis.)


At the first meeting the leader outlined to Groups 
B and D the general method of goal setting, and en- 
couraged them to consider it. After they had agreed 
to adopt the method, Groups B and D discussed 
during each meeting the previous week’s record and 
set the goal for the following week. The predeter- 
mined list of topics was discussed at the same meet- 
ings. Members of these groups were encouraged to 
use their own judgment in setting goals, but were 
reminded that unless they set the goal a little above 
their present accomplishment they would be unable 
to determine the effectiveness of the group when 
working as a team. 

Matching of groups was not perfect because of 
types of work performed, experience of employees, 
etc. Statistical analysis showed that paired groups 
did not differ appreciably or significantly in respect 
to dexterity (as measured by the Minnesota Rate 
of Manipulation Test—Turning) or intelligence (as 
measured by the Wonderlic Personnel Test). There 
were small but statistically insignificant differences 
in age and length of employment. The possible in- 
fluence of these factors was controlled by assigning 
groups to discussion and goal conditions so that for 
the pair composed of Groups A and B, differences in 
age and length of employment were in the direction 
opposite to that of the differences between Groups 
C and D. (As will be indicated later, none of these 
factors proved important in the results of the ex- 
periment.) 

Several methods were employed to encourage Ss’ 
participation in discussions. These included direct 
questions, case studies, modified role playing, ex- 
planation with opportunity to ask questions, and 
requests for suggestions from workers about topics 
for future meetings. 

Method of setting goals varied with the group. 
Group B worked in terms of mean units handled 
per person per hour (computed by dividing total 
weekly output by total hours worked in the depart- 
ment) and was told each day what its previous day’s 
average had been. Group D worked in accordance 
with a point system set in terms of total weekly 
output for the group—the standard was that from 
which piece-rate wages were computed, but the em- 
phasis was on points, not earnings. Points for this 
group were cumulative, and workers were informed 
each day how much closer to the goal they had 
progressed. Production records were prorated to 
compensate for time in discussion meetings and for 
the several absences that occurred. No formal state- 
ment of work accomplished was provided for the 
discussion groups. However, this information was 
readily available since these groups, as well as B 
and D, had always been well aware of progression 
of work through the department. 
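
The two production figures mentioned above can be illustrated with a short sketch. The function names and sample numbers are hypothetical, and the proration is assumed to be a simple linear scaling to a 40-hour week; the text says only that records were prorated, not how.

```python
def units_per_person_hour(total_weekly_output, total_hours_worked):
    """Group B's index: total weekly output divided by the total hours
    worked in the department."""
    return total_weekly_output / total_hours_worked


def prorate_to_full_week(observed_output, hours_worked, full_week_hours=40.0):
    """Scale observed output to a full week (assumed linear proration)."""
    return observed_output * full_week_hours / hours_worked


# Hypothetical figures, not taken from the study.
group_b_rate = units_per_person_hour(11000.0, 200.0)  # units per person per hour
prorated = prorate_to_full_week(225.0, 36.0)          # worker present 36 of 40 hours
```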


Results 


Preliminary analysis of tape recordings 
showed no obvious differences in the content 
of discussions. The meetings were favorably 
received by employees as well as supervisors. 
A typical comment, occurring during the 
second session, was, “I seem to have got more 
work done this week but it seemed easier 
than ever before.” Statements of this type 
were not unusual. Goals set and production 
attained (Groups B and D) are shown in 
Table 1. 

Obviously the goals were set too high for 
ready attainment after the first few weeks, a 
result also obtained by Bavelas (2). 


Table 1

Tabulation of Goals Set and Attained

      Group B                    Group D
Set*      Attained†        Set‡           Attained†
55        57.3             [illegible]    286.2
55        56.6             [illegible]    246.6
60        58.3             [illegible]    260.4
60        55.1             [illegible]    254.3
60        53.2             [illegible]    242.9§

* Units per person per hour.

† These figures were prorated in instances where absence occurred. They
now represent what production is assumed to have been if each employee
had participated in a 40-hour week.

‡ Total weekly units.

§ Employees of Groups C and D were informed of a forthcoming lay-off
during the course of this week.

Some fluctuations in over-all plant output 
occurred. However, this does not account for 
differences between groups since experimental 
periods ran simultaneously for all groups. 

A better indication of success of the program is obtained by examining
individual production increases. Relative changes in production for goal
groups are shown in Figs. 1 and 2. Each S’s performance was evaluated by
dividing mean difference in production by mean production during control
period and converting to percentages. Thus, no change would be indicated
by 0%.

Figs. 1 and 2. Individual production change: percentage increase or
decrease during experimental period as compared with control period.
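
The percentage index defined above can be written out as a brief sketch. It assumes that "mean difference" means mean experimental production minus mean control production; the function name and the weekly figures are hypothetical.

```python
from statistics import mean


def percent_change(control_weeks, experimental_weeks):
    """Change in mean production, as a percentage of the control mean."""
    control_mean = mean(control_weeks)
    experimental_mean = mean(experimental_weeks)
    return (experimental_mean - control_mean) / control_mean * 100.0


# Hypothetical weekly production for one S (five control weeks, five
# experimental weeks); not taken from the study.
control = [50.0, 52.0, 48.0, 50.0, 50.0]
experimental = [55.0, 56.0, 54.0, 55.0, 55.0]
change = percent_change(control, experimental)
```

An S whose production did not change between periods receives a value of zero, and a decrease gives a negative percentage.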

Nonparametric tests of significance (Table 
2) were used since the samples were small 
and the assumption could not be made that 
they were randomly drawn from a normally 
distributed population. Compared with its 
control period, each goal group showed an in- 
crease in production which is significant at 
the .05 level. Although production of dis- 
cussion groups increased during the experi- 
mental period, these increases did not attain 
statistical significance. When Group B was 
compared with Group A, the former had a 
greater increase in production (significant at 
the .05 level). The same was true of Group 
D as compared with Group C. When goal 
groups were combined and compared with 
combined discussion groups, the increase in 
production for goal groups was significantly 
greater at the .01 level. 
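
For readers unfamiliar with the paired method, the comparison of each group with its own control period can be sketched as follows. The data are invented, the function names are hypothetical, and the one-sided exact probability is an assumption, since the text does not say whether one- or two-tailed values were used. The Festinger method used for the between-group comparisons is not reproduced here.

```python
from itertools import product


def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def signed_rank_sums(control, experimental):
    """Return (sum of ranks of increases, sum of ranks of decreases,
    ranks of the absolute differences); zero differences are dropped."""
    diffs = [e - c for c, e in zip(control, experimental) if e != c]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus, ranks


def exact_one_sided_p(ranks, w_minus):
    """Probability, with every sign pattern equally likely, of a
    negative-rank sum no larger than the one observed."""
    hits = 0
    total = 0
    for signs in product((False, True), repeat=len(ranks)):
        total += 1
        if sum(r for r, negative in zip(ranks, signs) if negative) <= w_minus:
            hits += 1
    return hits / total


# Hypothetical mean production for six Ss in the control and
# experimental periods (not the study's data).
control = [100.0, 102.0, 98.0, 105.0, 99.0, 101.0]
experimental = [104.0, 109.0, 99.0, 110.0, 105.0, 103.0]
w_plus, w_minus, ranks = signed_rank_sums(control, experimental)
p_value = exact_one_sided_p(ranks, w_minus)
```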

In general, questionnaire results did not 
differ greatly for the various groups. The 
majority of workers indicated that time in 
discussion had been well spent (73%), that 


they preferred to meet with both their super- 
visor and department manager (69%), and 
that they had learned something new (69%). 
The most favored methods of conducting 
meetings were presentation of a problem 
situation for discussion, and explanation of 


Table 2

Summary of Statistical Analyses; Comparison of
Percentage Increases

Experimental Production vs. Control Production

Group                        p

Wilcoxon Paired Method (3)
  A (discussion)             >.05
  B (goal)                   <.05
  C (discussion)             >.05
  D (goal)                   <.05

Festinger Method (1)
  B vs. A                    <.05
  D vs. C                    <.05
  B and D vs. A and C        <.01


Lois C. Lawrence and Patricia C. Smith 


company policy. Among members of goal 
groups 76% thought the goals they had set 
were “about right,” 64% would like to con- 
tinue to meet in the same manner, while 21% 
would prefer in the future to meet for dis- 
cussion only. With one exception Ss thought 
that everyone had worked to meet the goal 
and 93% thought everyone had concurred in 
setting it. (All percentages include responses 
from Group E which included a goal con- 
dition.) 


Discussion of Results 


When compared to the control period, pro- 
duction increased for both discussion and 
goal groups. This increase was significantly 
greater for goal groups than for discussion 
groups. Analyzing individual records, it is 
found that individual production increases 
are not correlated with age, length of em- 
ployment, salary, dexterity, or intelligence. 
The significant variable is the difference in 
experimental procedure between the discus- 
sion and goal groups. Experimental groups 
were necessarily small to avoid inclusion of 
learners and nonproduction workers. Since 
the study was intended as a replication, con- 
trol was considered more important than size 
of sample. 

Discrepancies between goals set and goals 
attained by Group D may be attributed in 
part to factors which E could not control. 
During the second week of the study, swatch- 
ers were given a new job lot containing puck- 
ered fabrics (which stick together) and sheer 
nylon (for which the glue proved unsuitable). 
Output necessarily diminished and the man- 
ager recognized the changed conditions by 
promising a change in the standard for those 
lots. Workers were not discouraged when 
they failed to reach their goal, but lowered 
it to a more realistic level when told in ad- 
vance of the coming week’s work. However, 
both Groups C and D worked on the same 
assignment. D’s greater production increases, 
despite adverse conditions, contribute further 
evidence of the effectiveness of the group-de- 
cision method. 

Discrepancies between goals set and at- 
tained by Group B cannot be as easily ex- 
plained. To a certain extent, ability to 





Group Decision and Employee Participation 


reach the goal depended on daily volume of 
mail, which fluctuated somewhat. During 
the final weeks when production was lower 
than at first (but still considerably higher 
than during the control period), there were 
several days when output was very close to 
60. Tape recordings indicate that this group 
did not lower its goal because they felt the 
level was potentially attainable. 

It should be stressed that the group-discus- 
sion method involves an extensive learning 
process. All too often we forget that a per- 
son unaccustomed to verbal participation in 
such situations requires much training be- 
fore he realizes the permissiveness of the 
situation and contributes freely. Progressive 
increase in spontaneity of response is appar- 
ent throughout the course of the recordings. 
This does not imply, however, that subjects 
attained a level of verbal facility equivalent 
to that of the average college student. De- 
spite the relative reticence of workers, there 
were clear-cut changes in production which 
must have reflected changes in attitude. 

When considering application of this 
method, one should remember that these 


production increases were obtained during a 
short-term experimental period that was ne- 
cessitated by the seasonal nature of plant con- 
ditions. A long-term program would be pref- 
erable both to ensure continuing effectiveness 


and to allow a less hurried pace for the verbal 
learning process. 


Summary 


An experiment was performed to determine 
whether industrial employees setting their 
own group goals attained higher production 
output than employees participating in group 
discussion only. Two pairs of groups simul- 
taneously completed the experimental pro- 
gram. When mean experimental production 
was compared with mean control production, 
on an individual basis, it was found that those 
groups setting their own goals showed signifi- 
cantly greater increases. It is suggested that 
the group discussion method is a learning 
process and must be considered as such when 
plans are being made for application to spe- 
cific situations. 


Received October 20, 1954. 


Proficiency as evaluated in typical junior 
administrative officer assignments depends 
primarily on various personality and attitude 
characteristics. Attempts to develop person- 
ality and attitude measures to predict admin- 
istrative performance, however, have pro- 
duced thus far essentially negative results. 
One of the major problems encountered has 
been the adequacy of available interim cri- 
teria. Ratings by associates have proved the 
most reliable and, in addition, have yielded 
relatively high correlations against more ulti- 
mate criterion ratings (3, 8, 9). Associate 
ratings for specific critical personal charac- 
teristics, however, at least in a training situa- 
tion such as the Air Force Officer Candidate 
School, are highly intercorrelated and appear 
to be based on “reputation” rather than on 
relevant observations (3, 7). These studies 
indicate that officer candidates do not have 
the information required to rate their asso- 
ciates adequately. The Officer Candidate 
School environment does not provide, and 
cannot reasonably be expected to provide, the 
candidate with the opportunity to display be- 
haviors critical to success as a junior adminis- 
trative officer. 

The present research has attempted to de- 
velop a procedure for evaluating candidates 
during their officer training with respect to 
specific personal characteristics determined to 
be critical in effective on-the-job performance. 
The experience of other investigators has in- 
dicated that the situational performance test 
is a promising approach to the measurement 


1 This research was one of a series of studies by 
the American Institute for Research, sponsored by 
the United States Air Force under Contract No. AF 
33(038)-10587, and monitored by the Personnel Re- 
search Laboratory, Air Force Personnel and Training 
Center, Lackland Air Force Base, Texas. The opin- 
ions expressed are those of the author, and do not 
necessarily reflect those of the Air Force. 

This study was a part of a doctoral dissertation 
(6). The author wishes to express appreciation to 
her advisor, Dr. John C. Flanagan. 


of these personal leadership qualities (1, 5). 
The situational test, by placing the individual 
to be evaluated in a simulated job situation, 
provides for replication in the test field of the 
essential aspects of the criterion or job field. 
The simulated job situation provides the can- 
didate with the opportunity to display behav- 
iors critical to job success, and provides at the 
same time for specifically directed observation 
and immediate recording of these behaviors. 
Assuming that “a given subject will respond 
to similar environmental situations in a similar 
manner,”² the situational test provides 
for an essential link with the typical job per- 
formance we are attempting to predict. 

A situational test is defined for the pur- 
poses of this study as a series of performance 
problems designed to simulate the job situa- 
tion of the Air Force junior administrative 
officer. The test is designed for use during 
the officer training program, to be adminis- 
tered by the Officer Candidate School staff. 
The problems are administered to groups of 
specified number under standardized condi- 
tions, using standardized instructions for ex- 
aminers and for participants. Individual per- 
formance on each problem is scored by means 
of a behavior check list specific to that prob- 
lem. The check list provides standard cues 
for examiners to use in scoring the perform- 
ance of the participants. 


Procedures 


In developing a situational test for evaluating offi- 
cer candidates, the first step was the identification of 
the critical personality and attitude aspects of suc- 
cessful administrative officer performance. Several 
thousand critical incidents concerning effective and 
ineffective officer performance have been collected 
in a study by Preston (4). Those incidents con- 
cerned with personal (nonintellectual) behaviors re- 
ported on junior administrative officers were selected 
and recategorized. Recategorization resulted in a list 


2 The assumption as quoted has been stated as the 
principle of “consistency” by the OSS Assessment 
Staff, Assessment of Men, New York: Rinehart, 
1948, p. 38. 







Evaluating Potential Officer Effectiveness 


of 54 behavior categories. These behaviors were 
classified in five major areas of behavior: 


I. Organizing, planning, making and acting on decisions
II. Working effectively with others
III. Adjusting to unusual or unexpected situations
IV. Accepting or assuming organizational responsibility
V. Personal habits of work; motivation


The next step was the development of a set of 
comprehensive rationales for the 54 critical behav- 
iors to aid in systematic development of the situa- 
tional problems (6). Certain types of problems are 
better adapted to the measurement of certain criti- 
cal behaviors, e.g., a group task, leader-designated 
type of situation is better adapted to the measure- 
ment of the behavior “delegates responsibility for 
specific tasks to subordinates” than is a leaderless 
group-conference type of situation. A series of prob- 
lems was developed to cover the critical behaviors in 
each of the five major areas. For certain situations 
an “actor” is required in order to elicit the desired 
behavior(s) properly. 

The materials for each problem include an Ex- 
aminer’s Guide, a score sheet, and individual in- 
struction cards for participants and for actors where 
required by the problem. The Examiner’s Guide pre- 
sents the examiner with a description of the prob- 
lem, a list of supplies required, a diagram of the 
physical layout and of the seating arrangement for 
the participants, and instructions specific to the 
problem. The score sheet, developed specifically for 
the particular problem, contains a number of brief 
test statements of the critical behaviors they are in- 
tended to measure. Test statements may be in ef- 
fective or ineffective form, but both forms are not 
included for the same critical behavior. The occur- 
rence of the behavior as stated on the score sheet is 
indicated by the examiner simply by a mark through 
the letter corresponding to that of the participant 
exhibiting the given behavior. 

Individual instruction cards are given to partici- 
pants and to actors. These are retained throughout 
that problem test period. In addition, the instruc- 
tions are read to them by the examiner. Actors are 
assigned to a specific problem and remain with that 
problem throughout the entire testing. The actor 
roles are partially structured in regard to action and 
verbalization in order to assure uniformity of test 
conditions while allowing the spontaneity and flexi- 
bility required in the situation. 

Several tryouts were conducted both at college Air 
Force Reserve Officer Training Course units and at 
the Officer Candidate School for the purpose of de- 
termining difficulties in administration procedures, in 
technical accuracy of the problems, and in clarity 
and completeness of the instructions. The final form 
of the test consists of sixteen problems: ten of 25- 
minute duration and six of 8-minute duration. This 
form was administered to 480 officer candidates dur- 
ing their last two weeks of training. 

One hundred and twenty candidates were tested 
each day for a four-day test period. Each of the 
120 candidates participated in all sixteen problems 
during the one six-hour test day. The candidates 
participated in groups of five. Each candidate within the 
group was assigned a letter A, B, C, D, or E, as well 
as the number of his group, for the entire test day. 
Candidates were evaluated in their assigned groups 
on the ten 25-minute problems; and individually, 
with an actor, on the six 8-minute problems. Evalua- 
tion in terms of role played and in terms of scorable 
behavior items was identical for all participants with 
the letter A. These participants constituted Partici- 
pant Group A. Similarly, intragroup evaluations 
were identical for Participant Groups B, C, D, and 
E. Intergroup evaluations, while not identical be- 
cause of differences in assigned roles for Groups A, 
B, C, D, and E, were equated insofar as possible for 
the total number of scorable items in each of the 
five major areas of behavior. The actual time dur- 
ing which an individual participant was in a posi- 
tion to be evaluated was 175, 159, 159, 164, and 164 
minutes for participants assigned the letters A, B, C, 
D, and E, respectively. 

The problems were administered by tactical offi- 
cers with student officers³ as actors and as addi- 
tional observers. Examiner-observer (-actor) teams 
were assigned to a specific problem and remained 
with that problem throughout the entire four-day 
testing. Both examiner and observer scored each 
participant on a score sheet specific for the prob- 
lem, with instructions that their scoring be inde- 
pendent. Prior to the four-day testing, examiners, 
observers, and actors participated in a three-hour 
training session. During this session the administra- 
tion and scoring of several of the situational prob- 
lems was demonstrated and discussed. Following 
the general training session, actors were trained in 
their roles by the examiner assigned to the problem. 


Results and Discussion 


The number of behavior items scored for 
one problem ranged from 12 to 34, with a 
median of 20.5. There were 332 items in the 
total test: 85, 137, 18, 36, and 56 in Areas 
I-V, respectively. An “effective” score was 
obtained by giving one point for each effec- 
tive item marked on the score sheet, an “in- 
effective” score by giving one point for each 
ineffective item marked. Total score was ob- 
tained by subtracting the ineffective score 
from the effective score and adding a constant 
to avoid negative scores. Items had been 
classified in the five major areas of behavior 
so that five area scores were obtained in ad- 
dition to total test scores. 
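
The scoring rule just described can be sketched in a few lines. This is an illustration only: the function name and the value of the additive constant are assumptions, since the report does not state the constant that was actually used.

```python
def situational_score(effective_marks, ineffective_marks, constant=50):
    """Total score = effective points - ineffective points + constant.

    effective_marks / ineffective_marks: iterables of booleans, one per
    score-sheet item, True when the examiner marked the item.
    constant: hypothetical value added to avoid negative totals.
    """
    effective = sum(1 for marked in effective_marks if marked)
    ineffective = sum(1 for marked in ineffective_marks if marked)
    return effective - ineffective + constant


# Hypothetical candidate: 12 effective and 5 ineffective items marked.
score = situational_score([True] * 12, [True] * 5, constant=50)
print(score)  # 57
```

Area scores would follow from the same rule applied only to the items classified in a given area.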


3 Student officers were those in training at the 
Officer Basic Military Course of the Officer Candi- 
date School, Lackland Air Force Base, Texas. 







Of the 480 candidates tested, 343 com- 
pleted all sixteen problems and constituted 
the analysis sample. Analyses were com- 
pleted within participant groups because of 
the possible differences among the groups in 
ease and opportunity of scoring on test items. 
Correlation coefficients obtained within the 
five Participant Groups A-E were then aver- 
aged, using the Fisher r-to-z transformation 
to obtain an over-all coefficient for the com- 
bined groups. 
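
The averaging step can be illustrated as follows. The sketch assumes an unweighted mean of the transformed values; the report does not say whether the five group coefficients were weighted by group size, and the example coefficients are hypothetical.

```python
import math


def average_correlations(rs):
    """Average correlation coefficients through Fisher's r-to-z transformation.

    Each r is converted to z = atanh(r), the z values are averaged
    (unweighted, by assumption), and the mean z is converted back to r.
    """
    zs = [math.atanh(r) for r in rs]
    return math.tanh(sum(zs) / len(zs))


# Hypothetical within-group coefficients for Participant Groups A-E.
group_rs = [0.70, 0.78, 0.74, 0.76, 0.77]
overall_r = average_correlations(group_rs)
print(round(overall_r, 3))
```

Averaging in the z metric rather than averaging the raw coefficients reduces the bias that arises because the sampling distribution of r is skewed.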


Interobserver Reliability 


The reliability of the scoring method was 
estimated by correlating the scores given the 
participants by the examiner with those given 
by the observer. The correlation coefficients 
obtained were .75 for total test scores, and 
.73, .68, .59, .61, and .63 for Areas I-V, re- 
spectively. 


Test Reliability 


The reliability of the test was estimated, 
using the Kuder-Richardson formula 20. This 
formula does not require that comparable 
halves be obtained.⁴ It does, however, as- 
sume that all the items are measuring the 
same behavior factor. If several behavior 
factors are involved, as might reasonably be 
expected in this test, the obtained value will 
tend to be lower than the comparable forms 
reliability. If, on the other hand, there is a 
tendency for performance of one behavior to 
influence the observation and scoring of a 
subsequent behavior in the same problem, a 
spurious source of agreement is introduced. 
Attempts were directed in the development of 
the problems and in the training of the ex- 
aminers to reduce such carry-over or “halo” 
effect. It is impossible to estimate the extent 
to which there was some degree of carry-over 
within the problems. As each problem was 


4 Split-half correlation coefficients were obtained 
for area and total test scores by correlating examiner 
scores on one set of eight problems with those on 
the other set of eight problems. It was not possible, 
however, to obtain well-matched halves in terms of 
number of scorable items for each problem group. 
The obtained values, corrected by the Spearman- 
Brown formula, were .37, .16, .22, .06, and .21 for 
Areas I-V, respectively, and .43 for the total test. 
The values for Area I and for total test are signifi- 
cant at the .01 level, those for Areas III and V, at 
the .05 level. 

Table 1 

Intercorrelations Among Area and Total Scores 
(N = 343) 

Area 

II III IV V 

.256** .047 
.106* 

.166** .409** 

S74 233" 

.144** .082 
‘z7° 

* Significant at .05 level. 
** Significant at .01 level. 

observed and scored by a different examining 
team with no knowledge of the other perform- 
ances or scores of the participants observed, 
there can be no carry-over effect from one 
problem to another. The obtained Kuder- 
Richardson values were as follows: for Areas 
I-V, respectively, .57, .51, .16, .33, and .32; 
for total test, .68. All values except that of 
.16 for Area III are significant at the .01 
level.⁵ 
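
For readers who wish to compute coefficients of this kind, the standard textbook forms of Kuder-Richardson formula 20 and of the Spearman-Brown correction can be sketched as follows. The sketch assumes items scored 0 or 1 and uses the population variance of total scores; how the effective and ineffective items of the present test were entered into the formula is not stated, so the data layout and function names are hypothetical.

```python
def kr20(item_matrix):
    """Kuder-Richardson formula 20 for dichotomous (0/1) items.

    item_matrix: list of rows, one row per examinee, one 0/1 entry per item.
    KR-20 = k/(k-1) * (1 - sum(p*q) / variance of total scores).
    """
    n = len(item_matrix)
    k = len(item_matrix[0])
    totals = [sum(row) for row in item_matrix]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    if var_total == 0:
        raise ValueError("total scores have zero variance")
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_matrix) / n  # proportion marked
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)


def spearman_brown(r_half):
    """Step a half-test correlation up to full-test length."""
    return 2 * r_half / (1 + r_half)


# Hypothetical data: four examinees, three items.
data = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(round(kr20(data), 2))
print(round(spearman_brown(0.60), 2))
```

As the text notes, KR-20 treats all items as measuring one factor, so a heterogeneous item set will tend to yield a lower value than a comparable-forms estimate.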


Intercorrelations Among Area Scores 


Table 1 presents the intercorrelations among 
area and total scores. This table, particularly 
when considered in conjunction with the ob- 
tained area reliabilities, suggests a close rela- 
tionship between the behaviors of Area I and 
those of Area V. 


Correlations with Officer Candidate School 
Evaluation Measures 


Test scores were correlated with various 
OCS measures available for the participants. 
These measures included flightmate and tac- 
tical officer paired-comparison ratings on over- 
all officer ability, final military class standing, 
and final academic grade.⁶ The first three 
measures are highly intercorrelated, as indi- 
cated by the coefficients obtained for the 343 


5 In the absence of a satisfactory formula for the 
standard error of the Kuder-Richardson coefficients, 
the levels for the corrected split-half coefficients were 
used as a probable conservative estimate in indicat- 
ing levels of significance. 

6 The correlations between test scores and final 
academic grade are based on 310 participants, and 
were computed by the staff of the Personnel Re- 
search Laboratory, Human Resources Research Cen- 
ter, Lackland Air Force Base. 
participants (Table 2). An average of the 
first two measures contributes one-half to the 
third measure: final military class standing. 
The other half is composed of several similar 
paired-comparison ratings made earlier in the 
training period. A previous study (2) has 
indicated that these measures are based pri- 
marily on “reputation,” i.e., a general impres- 
sion regarding officer ability which influences 
any and all ratings. These measures are, 
nevertheless, the best single predictor of later 
success at present available for officer candi- 
dates. Table 2 presents the correlation coeffi- 
cients obtained for total test scores with four 
OCS measures. 

None of these measures is regarded as a 
criterion measure, but rather as providing 
some indication of what the test is measur- 
ing. Military and academic grades, as well 
as associate ratings, are predictive of later 
success, though at a somewhat lower level. 
It is encouraging that the correlations ob- 
tained in this study show significant relation- 
ships between the measure developed and 
those grades and ratings used at present by 
the Officer Candidate School in evaluating 
candidates. Because the correlations are not 
high, they indicate that the situational prob- 
lems are measuring something other than aca- 
demic and reputation factors, and may make 
a unique contribution to the prediction of offi- 
cer success. This seems a reasonable assump- 
tion on the basis of the method used in de- 
veloping the test. Follow-up data for this 
class are required for adequate validation of 
the situational test, and thus for any evalua- 
tion beyond preliminary speculation, and are 
not available at this time. 


Table 2 


Correlations of Situational Test Scores with 
OCS Evaluation Measures 


OCS Measures*                              N      r** 

1. Flightmate paired comparisons          343    .21 
2. Tactical officer paired comparisons    343    .24 
3. Final military class standing          343    .25 
4. Final academic grade                   310    .25 


* Intercorrelations among measures 1, 2, and 3: r12 = .83, 
r13 = .89, r23 = .91. 
** All coefficients significant at .01 level. 


Summary 


A situational performance test was devel- 
oped for evaluating potential officer effective- 
ness during the officer training program. The 
personal (nonintellectual) behavior charac- 
teristics critical to effective Air Force junior 
administrative officer performance were identi- 
fied. Situational problems simulating an- 
ticipated job situations were systematically 
designed to elicit these critical behaviors under 
standardized test conditions. Behaviors were 
observed by trained examiners and recorded 
by means of a behavior check list which pro- 
vided for scoring as the behavior occurred 
simply by a mark in the appropriate place. 

The test was administered by the Air Force 
Officer Candidate School staff to an entire 
graduating class of 480 candidates. Each 
problem was administered by an examiner- 
observer team assigned to the problem for the 
entire testing period. The following results 
were obtained: 

1. Reliability of scoring for the total test 
was estimated as .75 by correlating examiner 
scores and observer scores. 

2. Test reliability was estimated by the 
Kuder-Richardson formula 20. A coefficient 
of .68 was obtained. 

3. The correlation coefficients obtained be- 
tween total test scores and Officer Candidate 
School evaluation measures were: with flight- 
mate paired-comparison ratings, .21; with 
tactical officer paired comparison ratings, .24; 
with final military class standing, .24; and 
with final academic grade, .25. 

It appears that the situational performance 
test approach to the problem of evaluating 
personality and attitude characteristics re- 
quired for successful officer performance is 
promising. The situational test has potential 
value in two areas: (a) in providing an in- 
terim criterion measure for the development 
of more economical selection and evaluation 
devices, and (b) in supplementing present 
training procedures aimed at developing these 
personal characteristics. An important con- 
sideration in this type of test in a training 
situation such as the Air Force Officer Candi- 
date School is in providing the opportunity 
for the display of behaviors essential to job 
success—an opportunity that the candidate 
does not have in the training situation. The 
situational test in simulating the anticipated 
job situation has at least logical validity. As- 
suming an individual will respond to similar 
environmental situations in a similar manner, 
the situational test provides for an essential 
link with the typical job performance which 
we are attempting to predict. 


Received August 11, 1954. 

The ability to trouble shoot electrical 
equipment adequately has become increas- 
ingly important as the Navy has come to 
rely more and more on radio, radar, and other 
electronic equipment. Interviews (2) with 
knowledgeable naval personnel have indicated 
that, at least for the aviation electrician, the 
ability to trouble shoot is the ability that 
distinguishes between degrees of proficiency. 
This paper presents a study directed toward 
building a job sample trouble-shooting per- 
formance battery (AE Trouble-Shooting Per- 
formance Examination) and a written test 
which predicts the job sample performance 
examination. Although the approach to the 
trouble-shooting problem was limited to avia- 
tion electricians, it is believed that enough 
sympathetic resonance exists between trouble 
shooting done specifically by aviation elec- 
tricians and that generally performed else- 
where to make the approach, methodology, 
and outcome of interest to everyone con- 
cerned with measuring trouble-shooting ability. 

Moreover, Fattu, Mech, and Kapos have 
recently emphasized the need for “. . . in- 
formation with reference to describing the be- 
haviors or response categories that are impor- 
tant during the actual problem-solving proc- 
ess” (1, p. 1). The present study presents 
information relevant to the behaviors in- 
volved in a problem-solving (trouble-shoot- 
ing) situation. 

A logical analysis of the trouble-shooting 
process indicated that in order to trouble 
shoot electrical equipment systematically, the 
aviation electrician must perform four work 
steps: 


1 The research here reported is based on a portion 
of the data gathered under Contract N8onr-69402 
between the Office of Naval Research and the Insti- 
tute for Research in Human Relations. The opin- 
ions expressed are not to be construed as reflecting 
the opinions of the Office of Naval Research or of 
the naval service. 

2 The following persons made a substantial con- 
tribution to the design, organization, or carrying out 
of this research: Dr. D. Mayo, Dr. W. Schaefer, Dr. 
D. Smith, Mr. J. Nagay, Lt. Comdr. H. Pitcher, 
Lt. Comdr. G. Herndon, Dr. F. K. Berrien, Dr. D. 
Courtney, Dr. E. Danzig, Mr. J. H. Hill, Mrs. M. 
Schultz, Dr. D. Schultz, and Dr. W. Angoff. 


1. formulate hypotheses as to the cause of 
the malfunction, 

2. perform actual electrical checks and 
equipment performance checks in order to 
substantiate or reject the hypotheses of 1 
above, 

3. diagnose (ascertain) the cause of the 
malfunction from a synthesis of the informa- 
tion obtained in the electrical and equipment 
performance checks with the precheck hy- 
potheses, 

4. perform the actual work required for 
elimination of the cause of the discrepancy. 

Four separate performance tests were con- 
structed. Each test was directed toward 
measuring the aviation electrician’s ability in 
one of the four logical areas outlined above. 
The four comprised the battery which has 
been called the AE Trouble-Shooting Per- 
formance Examination. Because of limita- 
tions of time, personnel, and material, in 
many situations the administration of a de- 
tailed performance examination is not fea- 
sible. Therefore, a further purpose of the 
study was to develop a short paper-and-pencil 
examination which would predict the per- 
formance examination and thus would enable 
responsible personnel to predict by inference 
the trouble-shooting ability of aviation elec- 
tricians. 


Method 


Performance Subtest I—Test of Ability to Formulate 
Hypotheses 


In order to formulate correct and logical hypothe- 
ses as to the cause of a malfunction, the aviation 
electrician must have a knowledge of the individual 
components of the system in question, the purpose 
and operation of the system as a whole, and any in- 
teractions between the system in question and other 
systems. In order to develop a test of this ability, 
entries made over a three-month period in five elec- 
trical shop maintenance logs were analyzed in order 
to determine the relative frequency of occurrence of 
various discrepancies. As a result of this analysis, 
39 typical discrepancies became apparent. These dis- 
crepancies were submitted to three Chief Aviation 
Electricians who separately rated each of the dis- 
crepancies in terms of pay grade³ and rating struc- 
ture applicability. On the basis of these ratings and 
further discussion, 24 discrepancies were chosen for 
this test. Six of these discrepancies were at a diffi- 
culty level appropriate for “strikers,” six were at a 
difficulty level appropriate for Aviation Electricians, 
third class; six for Aviation Electricians, second 
class; and six for Aviation Electricians, first class. 
Moreover, the test included discrepancies which were 
representative of at least 10 aircraft types. A sam- 
ple item, wiring diagram omitted, is presented below: 


Discrepancy 3. Left starter will not energize. 
(Starter motor was bench checked and operation 
found to be normal.) 

Which one(s) of the following could cause 
this discrepancy? 

1. Open circuit on lead coming from power 
circuit to relay contacts. 

2. Wire from circuit breaker is off of switch. 

3. High resistance at solenoid terminal. 

4. Etc. 


3 Men within a naval rating (aviation electrician) 
are classified according to their rate (striker, third 
class, second class, first class, and chief). Starting 
with “striker,” each succeeding rate receives more 
pay and is presumed to be at a higher skill level. 


Subtests IIa and IIb—Ability to Perform Electrical 
Checks and Understanding Electrical Circuit Op- 
eration 

In order to test the aviation electrician’s ability to 
perform the electrical checks necessary to substanti- 
ate or reject hypotheses on the cause of discrep- 
ancies, the Electrical Checks Testing Box was con- 
structed. In essence, this box was a simulator of 
the type of live circuits found in operational air- 
craft. 

In order to determine which of the many testing 
instruments used by the aviation electrician was 
most acceptable for the purposes of this subtest, a 
list of testing instruments used by the aviation elec- 
trician was compiled by two chief petty officers. 
The chiefs used personal experience, consultation 
with other chiefs, and service change manuals as the 
basis for their list. The list of instruments was then 
checked for completeness by a third chief. In all, 
the final list contained 27 instruments. Seven chiefs 
were then separately asked to select from the list the 
five testing instruments required for performing the 
duties of the aviation electrician, which best met the 
criteria of: 


1. relative frequency of use of the instrument dur- 
ing trouble shooting, 

2. criticalness of the instrument for performing the 
duties of the aviation electrician, 

3. actual usage of the instrument at the squadron 
level. 

The three instruments which were most frequently 
chosen as one of the five most important were the 
multimeter, the varidrive test panel, and the red star 
field-test set. A further intensive discussion was 
held on the merits of these three testing instruments 
in the light of the criteria. This resulted in the 
elimination of the varidrive and the red star field- 
test set. Multimeters, therefore, became the sole 
testing instrument employed. The task of the ex- 
aminee was to perform a series of electrical checks 
on the Electrical Checks Testing Box, using the 
multimeter as a test instrument. Following each 
electrical check made by the examinee, a question 
was asked which probed into his understanding of 
operation of electrical circuits. A sample item fol- 
lows: 


Item 6. (a) Make a voltage check between 
D. and I. Reading in Volts ; 

(b) If resistor No. 7 were replaced 
by a resistor of 20 more ohms and resistor No. 
8 were replaced by a resistor of 20 less ohms, 
you would expect your voltmeter reading to be 
(the same) (greater) or (less). 


The total test contained 10 ohmmeter, 10 voltmeter, 
and 10 ammeter checks (“a” items), as well as 30 
related problems pertaining to the aviation elec- 
trician’s understanding of the operation of electrical 
circuits (“b” items). The questions were so devised 
that all difficulty levels were represented and were 
devised so that the examinee was not required to 
know the value of the resistors in order to solve the 
problems. The questions on electrical checks (“a” 
items) were scored separately from those questions 
on understanding of electrical circuit operation (“b” 
items). Scores on the “a” items are henceforth re- 
ferred to as scores on Subtest IIa and scores on the 
“b” items are henceforth referred to as scores on 
Subtest IIb. 


Subtest III—Ability to Diagnose 


In order to determine the aviation electrician’s 
ability to synthesize the results of his electrical 
checks with his precheck hypotheses and mentally 
derive the single most probable cause of an electrical 
malfunction, Subtest III was constructed. Subtest 
III included 24 problems. In each problem the 
examinee was given a discrepancy (with an accom- 
panying wiring diagram) plus the results of elec- 
trical checks performed in order to locate the dis- 
crepancy. The task of the examinee was to deter- 
mine the one cause of the trouble. A typical item 
(wiring diagram omitted) follows: 


Discrepancy 1 (Wiring Diagram 1). The left 
engine cylinder temperature indicator reads 20° 
lower than the right indicator. A calibration 
check of left indicator with a Wheelco cylinder 
temperature tester proves it to be properly cali- 
brated. 


Which one of the following is the cause of this 

discrepancy ? 

1. Low resistance between plus and minus of 
indicator leads. 

2. A short in the right variable resistor. 

3. The left variable resistor adjustment has 
slipped to low side. 

4. Etc. 

Specific discrepancies included in Subtest III were 
selected according to the following criteria: 

1. The subtest should include items involving all 
the major electrical systems with which the aviation 
electrician is concerned in his maintenance work (i.e., 
DC power supply and control, AC power supply and 
control, interior and exterior lighting, power plant, 
instruments, and special materials and equipment). 
2. The subtest should include items pertaining to 
as many aircraft and aircraft types as possible. 
3. The subtest should include items at all difficulty 
levels. 


Subtest IV—Ability to Eliminate the Cause of a Dis- 
crepancy (Basic Skills Test) 


Once an aviation electrician has determined the 
cause of a discrepancy he must perform the work 
necessary to complete the repair. Subtest IV com- 
prised a test of the aviation electrician’s ability to 
perform the actual work necessary to eliminate the 
cause of a discrepancy. In order to test this ability 
the Basic Skills Testing Box was designed and con- 
structed. This box simulated, in miniature, an air- 
craft section and contained a simulated motor, con- 
trol relay, control cable, ribs, fuel line, junction box, 
and lightening holes. The task of the examinee was 
to solder a series of five wires to an eight-pin can- 
non plug and then run the wires in accordance with 
an accompanying schematic diagram to the junction 
box, motor, and control relay. The abilities involved 
in completing this task are the same as those in- 
volved in the work of a variety of operational situa- 
tions. 

After a pretest, Subtests I, IIa, IIb, and III were 
administered to 162 fleet aviation electricians. Sub- 
test IV, after its pretest, was administered to a ran- 
dom sample of 137 of these 162 aviation electricians. 


Development of a Written Trouble-Shooting Examination 


In order to develop the items necessary for the 
written examination, 665 multiple-choice type items 
were obtained. These items were administered to a 
well-planned sample of 399 aviation electricians. 
From the results of this test administration, an index 
of the difficulty level of each item was obtained. 
Each incorrect response (distracter) was also ana- 
lyzed in terms of the number of examinees attracted 
to it. Item-test correlations were obtained for each 
item. Items with difficulty levels greater than .95 
or less than .05 were eliminated from the pool. Dis- 
tracters which attracted too many or too few re- 
sponses were rewritten in some cases, and in other 
cases, the total items containing these distracters 
were eliminated. Items with low or negative item- 
test correlations were discarded. This procedure re- 
sulted in the retention of 359 of the original items. 
These 359 items were administered to the group of 
137 aviation electricians previously described. None 
of these 137 examinees was a member of the group 
of 399 who originally took the items. From this 
sample of 137 aviation electricians, 100 were ran- 
domly selected and item-composite performance test 
score biserial correlations computed for each of the 
359 items. Difficulty levels were also obtained. All 
items were then put into a frequency distribution 
and the 106 most valid items selected. The item- 
composite performance test biserial correlations of 
these 106 items ranged from .25 to .55; the median 
was .33. 
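
The first screening stage described above can be sketched as follows. Several details are assumptions made for illustration: the report gives no numerical cutoff for a "low" item-test correlation (the `min_r` value is hypothetical), the item-test correlation is computed here as a simple Pearson (point-biserial) coefficient against the uncorrected total score rather than the biserial coefficient the authors used in the second stage, and distracter analysis is omitted.

```python
import math


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def screen_items(responses, min_p=0.05, max_p=0.95, min_r=0.10):
    """Return indices of items that survive difficulty and validity screening.

    responses: list of rows (one per examinee) of 0/1 item scores.
    Items with proportion correct outside [min_p, max_p] are dropped,
    as are items whose item-test correlation falls below min_r.
    """
    totals = [sum(row) for row in responses]
    kept = []
    for j in range(len(responses[0])):
        item = [row[j] for row in responses]
        difficulty = sum(item) / len(item)
        validity = pearson(item, totals)
        if min_p <= difficulty <= max_p and validity >= min_r:
            kept.append(j)
    return kept


# Hypothetical responses: item 0 is passed by everyone, item 3 runs
# against the total score, items 1 and 2 behave as intended.
responses = [[1, 1, 1, 0], [1, 1, 1, 0], [1, 0, 0, 1], [1, 0, 0, 1]]
print(screen_items(responses))  # [1, 2]
```

The second stage, in which items were ranked by their correlation with the composite performance score, would replace the total score in `pearson` with that external criterion.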


Results and Discussion 
Hypothesis Formulation Test—Subtest I 


The means,⁴ standard deviations, split-half 
reliabilities, and Spearman-Brown corrections 
for the sample on the performance subtests, 
as well as the analogous data for the com- 
posite battery are presented in Table 1. Since 
only 137 subjects took Subtest IV, the N for 
the composite test in Table 1 is 137. The 
composite performance score of an individual 
was obtained by simple addition of each of 
his scores on the individual subtests. The 
composite reliability for all men was esti- 
mated to be .86. Composite reliabilities were 
calculated according to the formula: 
variance of subtest i 

correlation between subtests i and j 
standard deviation of subtest i 
standard deviation of subtest j. 
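
The formula itself has not survived in this copy of the text. The surviving definitions are consistent with the standard expression for the reliability of an unweighted sum of subtests, in which the error variance of each subtest is divided by the variance of the composite. The sketch below implements that standard expression as an assumption; it is not presented as the authors' exact formula, and the argument names and example values are hypothetical.

```python
def composite_reliability(sds, reliabilities, corr):
    """Reliability of an unweighted sum of subtest scores.

    sds:           standard deviation of each subtest
    reliabilities: reliability coefficient of each subtest
    corr:          square matrix of correlations between subtests

    composite variance = sum(sd_i^2) + 2 * sum_{i<j} r_ij * sd_i * sd_j
    error variance     = sum(sd_i^2 * (1 - r_ii))
    reliability        = 1 - error variance / composite variance
    """
    k = len(sds)
    variance_sum = sum(sd ** 2 for sd in sds)
    covariance_sum = sum(
        2 * corr[i][j] * sds[i] * sds[j]
        for i in range(k)
        for j in range(i + 1, k)
    )
    composite_variance = variance_sum + covariance_sum
    error_variance = sum(
        (sd ** 2) * (1 - rel) for sd, rel in zip(sds, reliabilities)
    )
    return 1 - error_variance / composite_variance


# Hypothetical two-subtest battery.
r_composite = composite_reliability(
    sds=[10.0, 10.0],
    reliabilities=[0.80, 0.80],
    corr=[[1.0, 0.5], [0.5, 1.0]],
)
print(round(r_composite, 3))
```

Under this expression the composite is more reliable than its parts whenever the subtests are positively correlated, which agrees with a composite value exceeding most of the individual subtest values.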

Assuming a man’s naval rate as an aviation 
electrician to be a valid criterion, the validity 
of the individual subtests in the performance 
battery as well as the validity of the com- 
posite battery can be partially determined by 
tests of the significances of the differences be- 
tween the rate means on the various subtests 
as well as between the means of the rates for 
the composite battery. The differences which 
were statistically significant below the .05 
level of probability were: 


Subtest I—Hypothesis Formulation 


1. Between Aviation Electricians, first class 
and strikers. 

2. Between Aviation Electricians, first class 
and third class. 

3. Between Aviation Electricians, second 
class and strikers. 

4. Between Aviation Electricians, second 
class and third class. 

5. Between Aviation Electricians, third 
class and strikers. 

4 These scores, as are all other performance test 
scores here reported, are standard scores adjusted to 
yield a mean of 50 and a standard deviation of 10. 
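
The adjustment described in this footnote is a linear transformation of the raw scores. A minimal sketch follows, assuming the population standard deviation of the sample is used in the conversion; the report does not specify this, and the raw scores shown are hypothetical.

```python
import statistics


def to_standard_scores(raw, mean=50.0, sd=10.0):
    """Rescale raw scores linearly to the given mean and standard deviation."""
    raw_mean = statistics.fmean(raw)
    raw_sd = statistics.pstdev(raw)  # population SD, by assumption
    if raw_sd == 0:
        raise ValueError("raw scores have zero variance")
    return [mean + sd * (x - raw_mean) / raw_sd for x in raw]


# Hypothetical raw subtest scores.
raw_scores = [12, 15, 18, 21, 24]
print([round(s, 1) for s in to_standard_scores(raw_scores)])
```

Putting every subtest on the same scale before summing gives each subtest equal nominal weight in the composite score.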

Table 1 
Performance Means, Standard Deviations, and Reliabilities 


Column headings for each subtest: Mean, SD, Split-Half Reliability, Spearman-Brown Correction 





Subtest I—Hypothesis Formulation 


Subtest IIa—Electrical Checks 





54.71 
54.26 
47.82 
43.85 


8.60 85 
8.55 73 
9.28 74 
9.12 719 

82 


92 49.63 
85 53.23 
85 50.22 
88 46.98 
90 .73 


9.68 72 
7.32 54 
9.00 .67 
12.93 85 


Subtest IIb—Understanding of 


Circuit Operation 


Subtest III—Diagnosis 





52.74 
52.54 
49.62 
46.48 


9.47 
9.95 
9.90 
9.99 


Subtest IV—Basic Skills 


10.21 46 
8.28 37 
9.11 38 

10.46 56 

47 


Composite 





AE 1 31 
AE 2 33 
AE 3 38 
Striker 35 
Total 


7.22 58 
8.60 31 
9.21 55 
12.07 65 
137 56 


260.58 
262.39 


26.54 82 
26.48 .80 
246.55 26.90 82 
224.71 28.92 .86 
137 .86 





Subtest IIa—Electrical Checks 


1. Between Aviation Electricians, second 
class and strikers. 


Subtest IIb—Understanding of Electrical Cir- 
cuit Principles 


1. Between Aviation Electricians, first class 
and strikers. 

2. Between Aviation Electricians, second 
class and strikers. 


Subtest III—Diagnosis 


1. Between Aviation Electricians, first class 
and strikers. 

2. Between Aviation Electricians, second 
class and strikers. 

3. Between Aviation Electricians, third 
class and strikers. 


Subtest IV—Basic Skills Test 


1. Between Aviation Electricians, first class 
and strikers. 

2. Between Aviation Electricians, second 
class and strikers. 

3. Between Aviation Electricians, third 
class and strikers. 


Composite Scores 


1. Between Aviation Electricians, first class 
and third class. 

2. Between Aviation Electricians, first class 
and strikers. 

3. Between Aviation Electricians, second 
class and third class. 

4. Between Aviation Electricians, second 
class and strikers. 

5. Between Aviation Electricians, third 
class and strikers. 


Moreover, all individual subtests showed 
progressively increasing mean scores as rate 
increased from striker through second class. 
A general leveling of this stepwise increase 
was seen at the second-class level. No dif- 
ference between the scores of the first-class 
and second-class Aviation Electricians was 
statistically significant. Part of the reason 
for the fact that the Aviation Electricians, 
first class, showed no mean performance scores 
that were significantly superior to the comparable 
scores of the Aviation Electricians, 
second class, may be that the men in the first- 
class rating are occupied at administrative 
functions in their daily work. We deliberately 
purified our performance tests of this type of 
material. All of these data are interpreted as 
supporting the validity of the AE Trouble-Shooting 
Performance Battery. 

Moreover, these data are useful in help- 
ing to gain insight into the trouble-shoot- 
ing (problem-solving) behaviors of the men 
tested. For instance, it seems that on 
hypothesis formulation, additional training 
might raise the level of ability of the men in 
the lower rates. In any case, the men in the 
higher rates do seem superior to those in the 
lower rates in this ability. Further research 
may be required in order to determine those 
factors other than experience which are in- 
volved in the ability to formulate hypotheses. 
Once these factors are known, people can be 
trained in them. Moreover, for each of the 
remaining performance subtests, fewer sta- 
tistically significant differences were seen, and 
the only significant differences were those be- 
tween the men in the higher rates and the 
strikers. It is also of interest to note that 
there were few statistically significant differ- 
ences between Aviation Electricians, second 
class and third class, as well as between Avia- 
tion Electricians, first class and second class. 
Hence, the general indications seem to be 
that small differences in rate do not reflect 
differences in actual trouble-shooting ability, 
while large differences may; and that the 
ability to perform in each of the areas meas- 
ured by our performance subtests may dif- 
ferentiate between good and poor trouble 
shooters. Thus, the aviation electrician’s 
trouble-shooting efficiency may be strength- 
ened by training the men in the higher rates 
in the abilities measured by Subtests IIa, IIb, 
III, and IV, and by training the men in 
the lower rates in the abilities measured by 
Subtest I. 

Moreover, aside from the comparatively 
high relationship between Subtests I and III, 
the performance subtests were relatively 
unique (Table 2), and by implication support 
the 


Table 2 


Intercorrelations Among Performance Subtests 
(N = 137) 








Subtest I IIa IIb III 

IIa .16 

IIb .32 

III .50 .23 .34 

IV .16 .06 1 .17 





adequacy of our logical analysis of the trou- 
ble-shooting process. 

The correlation of the written test with the 
composite scores on the AE Trouble-Shooting 
Performance Examination for 37 subjects was 
found to be .70 and the Kuder-Richardson 
reliability was .83. However, the question 
of what part of the variance of actual trouble 
shooting this test predicts remains open. 
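
As an illustration of the reliability figure reported above, the following is a minimal sketch of a Kuder-Richardson computation. The source does not say which Kuder-Richardson formula was used; the sketch assumes formula 20, dichotomously scored (0/1) items, and the population variance of total scores. The function name `kr20` and the toy score matrix are hypothetical and are not taken from the study.

```python
from typing import Sequence


def kr20(item_scores: Sequence[Sequence[int]]) -> float:
    """Kuder-Richardson formula 20 for 0/1 item scores (rows = examinees)."""
    n_people = len(item_scores)
    n_items = len(item_scores[0])
    # Proportion of examinees passing each item (p); q = 1 - p.
    p = [sum(row[j] for row in item_scores) / n_people for j in range(n_items)]
    sum_pq = sum(pj * (1 - pj) for pj in p)
    # Population variance of the total scores (assumed convention).
    totals = [sum(row) for row in item_scores]
    mean_total = sum(totals) / n_people
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_people
    return (n_items / (n_items - 1)) * (1 - sum_pq / var_total)


# Hypothetical toy data: 5 examinees by 4 items, not data from the study.
toy_scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(toy_scores), 2))
```

The coefficient rises as the items rank examinees more consistently, and it reaches 1.0 when every item separates the same examinees.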


Summary 


The preparation of a job sample trouble- 
shooting performance examination was de- 
scribed. The examination consists of four 
subtests, each of which is directed toward 
measuring one of the critical abilities judged 
to be needed in order to adequately trouble 
shoot on electrical apparatus. The tests were 
not highly intercorrelated, were judged to be 
moderately valid, and (except between first 
and second class) were found to discriminate 
moderately well between classes of naval 
aviation electricians. The data suggested cer- 
tain weak areas in the trouble-shooting (prob- 
lem-solving) abilities of the men in the vari- 
ous aviation electrician’s rates. The develop- 
ment of a short written test, which predicts 
this performance examination, was also de- 
scribed. 

Some Aspects of the Executive Personality 

In the past few years a number of studies 
have been carried out in an effort to tease 
out of personality test responses those char- 
acteristics which make for success in the busi- 
ness world. Unfortunately, although the term 
“executive” has in general been accepted as a 
criterion of such success, different authors 
have not always agreed as to what occupa- 
tional groups should be included under this 
label. In addition, studies in this area have 
been concentrated almost exclusively on two 
types of testing procedures: paper-and-pencil 
tests of various kinds (3, 5, 8) and the TAT 
(2, 4). 

In an effort to overcome difficulties of defi- 
nition, the executive sample selected for the 
present study was restricted to officers of 
large corporations: presidents, vice-presidents, 
secretaries, and treasurers, on the assumption 
that this group is, by virtue of its position, 
successful, and that any personality character- 
istics associated with business success should 
be most apparent at the top levels of the oc- 
cupational hierarchy. As in any study of a 
narrowly defined population, the problems of 
selecting an adequate control group differing 
from the experimental group only with regard 
to the variables under investigation presented 
difficulties. Our solution was to use two such 
groups, one consisting of college professors 
and the other of an occupationally heteroge- 
neous sample selected to approximate in age, 
education, and intelligence the characteristics 
of the executives. 

We have employed as our testing instru- 
ment the Tomkins-Horn Picture Arrangement 
Test,² a projective test which is particularly 
well suited to the study of personality vari- 
ables as they interact with and affect the in- 
dividual’s behavior in the work environment. 
The subject’s task is to select that arrange- 
ment of a set of three pictures which he feels 
makes the best sense and to write a short 

1 This investigation was supported in part by the 
Medical Research and Development Board, Office of 
the Surgeon General, Department of the Army, under 
Contract No. DA-49-007-MD-476. 
2 Hereinafter referred to as the PAT. 


sentence for each picture to indicate the na- 
ture of his story. There is a total of 25 
arrangements to be made, the nature of the 
pictures varying from plate to plate. The 
content of the test is, however, heavily con- 
centrated in the work area with a secondary 
emphasis on social interaction. For the pur- 
poses of the present study, we have restricted 
our analysis to the verbal material which re- 
sembles that obtained with the TAT but dif- 
fers in that it may be directly related to 
highly structured stimulus characteristics and 
also, being relatively brief, may be analyzed 
in much more detail than is generally prac- 
tical with a TAT protocol. It should be made 
clear, however, that what the PAT verbal ma- 
terial gains in specificity it loses in breadth, 
and as a result the number of variables which 
may be treated is reduced below the number 
potentially involved in a free response tech- 
nique utilizing highly ambiguous stimuli. 


Procedure 


The data on which this study is based were ob- 
tained by mailing a copy of the PAT, a 20-item, 
multiple-choice vocabulary test developed by Thorn- 
dike (6), and a background questionnaire to a num- 
ber of executives and college professors with a request 
that they fill out the tests and return them in the 
enclosed envelopes. The executives were selected in 
part on the basis of recommendations made to the 
junior author by his acquaintances and in part from 
among the officers of businesses listed in Moody’s 
Manual of Investments. The professors’ names were 
obtained from university catalogs. The relatively 
high percentage of returns, 32.0% for the executives 
and 45.7% for the professors, probably reflects the 
rather personal appeal such a study has for a num- 
ber of those contacted. 

Although the taking of the tests was not super- 
vised and, therefore, falsification is a possibility, 
especially on the intelligence test, it seems improb- 
able that it had any marked effect with the par- 
ticular samples studied, or, at any rate, any differ- 
ential effect. However, there may have been selective 
factors operating in determining which individuals 
would return their tests, and these factors might well 
have been a function of the personality character- 
istics of the subjects. If these selective factors were 
essentially the same in both groups, and it seems 
probable that they were, then our sampling pro- 
cedure acts in such a way as to reduce the probability 
of obtaining reliable group differences and 
makes the two samples spuriously similar. The re- 
sults obtained from the second control group serve as 
a partial check on this self-selection factor since this 
group consisted of all those employed males in an 
age and education range comparable to the executives 
who were at the top intelligence levels in the repre- 
sentative normal sample for the PAT (7), a sample 
which was obtained by the public opinion poll meth- 
ods using an area sampling procedure with quota 
controls. Nevertheless, the possibility exists that 
the executives who returned their tests are not rep- 
resentative of the group as a whole, and that the 
factors determining whether or not the tests were 
completed differed in the executive and professor 
groups. 

The final executive sample consisted of 44 cases, 
over half being company presidents. The businesses 
represented varied from those operating on an in- 
ternational scale to large, primarily local enterprises 
in the major cities. Geographically, the group came 
almost exclusively from the Middle Atlantic, Mid- 
western, and Southern Border States. Twenty-one of 
the 44 subjects are listed in Who’s Who in America. 

The college professor sample of 41 cases came from 
13 institutions varying from small private colleges 
to large state universities. Almost all of the sub- 
jects are full professors, several being department 
heads. The geographical areas represented are 
roughly the same as for the executives with the in- 
clusion of a few cases from the New England States. 
All but seven subjects work in the humanities and 
social sciences, 16 being sufficiently well known to 
be listed in Who’s Who in America. 

The second control group consisted of 25 indi- 
viduals from a number of occupations and from all 
areas of the country. It contains accountants, en- 
gineers, farmers, high school teachers, retail mer- 
chants, sales supervisors, a personnel manager, a 
dentist, a minister, an independent oil-well operator, 
and a lawyer. 


Table 1 


Age and Intelligence of the Executives, Professors, 
and Second Control Group 





t 
SD (1 vs. 2) (1 vs. 3) 


Mean 





52.9 7.2 05 91 
53.0 


1. Executives 

2. Professors* 

3. Second control 
group* 

Intelligence (raw score) 

1. Executives** 

2. Professors 

3. Second control 
group 


51.0 


17.5 





* No data available for one case. 
** No data available for five cases. 


Table 2 


Education of the Executives, Professors, and 
Second Control Group 








Years Completed 


Group 13-15 


Executives* 5% 10% 
Professors 0% 0% 
Second control group 20% 20% 








* No data available for four cases. 


In Tables 1 and 2 we have presented data on the 
age, intelligence, and education of our samples since 
these factors have previously been found to have an 
effect on some aspects of PAT performance (7). 
Neither of the control groups shows a reliable differ- 
ence from the executives in intelligence or age, al- 
though the second control group tends to be some- 
what lower in intelligence and the professors some- 
what higher. The mean intelligence scores for both 
the executives and professors are above the 96th 
percentile for the working population and that for 
the second control group is above the 93rd percentile. 
The data on education are somewhat more 
difficult to evaluate. We did not ask specifically for 
information on schooling beyond the bachelor’s de- 
gree. Presumably, all of the professors have had 
graduate work, and we know that at least some of 
the executives and second control group have. This 
comparison is further complicated by the fact that 
most executives spend years at what amounts to on- 
the-job training in business which is comparable to 
the graduate training of the professors but which 
cannot be specified in terms of years of formal edu- 
cation. If, however, the comparison is restricted to 
the number of years of academic education com- 
pleted, the professors may be presumed to have at- 
tained a level two or three years above the average 
executive. The second control group, on the other 
hand, is somewhat below the executives, being more 
heavily weighted with high school graduates and 
those who started but did not complete college. 
Thus, the two control groups serve to bracket the 
executives in terms of education, as they do to a 
lesser extent on intelligence; and, consequently, any 
differences between the executives and professors on 
the PAT which are a linear function of education 
should disappear or be reversed when the second 
control group is used in the comparison. 


Analysis and Results 


Our initial analysis consisted of a pilot 
study carried out on ten executives’ and ten 
professors’ protocols selected at random from 
the two larger samples. Their 20 records 
were then studied in some detail in order to 
isolate those factors which seemed to differ- 
entiate the two groups. On the basis of this 
preliminary work, seven variables were selected 
for further investigation using the 
larger samples. The seven hypotheses were: 


1. Executives will more frequently describe 
the hero in Plate 2 as experiencing some anx- 
iety over an illness. 

2. Executives will more frequently indicate 
that the worker in Plate 3 requires assistance 
in solving his problem with the machine. 

3. Executives will more frequently desig- 
nate the relationship between the worker and 
the foreman in Plate 5 as one involving re- 
ward or praise. 

4. Professors will more frequently suggest 
that on Plate 11 the man in bed is experienc- 
ing anxiety or depression which is not a re- 
sult of any conscious cause. That is, no rea- 
son will be given for the hero’s emotional 
state. 

5. Executives will more frequently elabo- 
rate on the cut finger in Plate 16 in such a 
way as to suggest a more serious injury, such 
as blood poisoning or an amputated finger. 

6. Professors will more frequently describe 
Plate 23 as a scene involving treatment for 
some type of psychiatric condition. 

7. Executives will more frequently attribute 
the anger of the worker in Plate 25 to some 
specific, external cause, such as mistreatment 
by his foreman. 

It should be emphasized that these hypothe- 
ses were not derived from studies carried out 
by other authors or from our own prior ex- 
perience. They were purely empirical deriva- 
tions from the comparison of the ten execu- 
tive protocols with those of the professors. 
In the initial evaluation of these hypotheses, 
we employed the total samples of 44 execu- 
tives and 41 professors. Testing the signifi- 
cance of the differences between frequencies 
in the two samples, we found only two re- 
liable values. These hypotheses, numbers 
one and two, were then tested again using 
samples which did not contain the 20 proto- 
cols included in the pilot study. Thus, we 
eliminated from the comparison those records 
which had been employed in the initial selec- 
tion of variables for investigation in an effort 
to avoid any spurious influence that capitali- 
zation on chance factors in the pilot study 
may have had in producing the two signifi- 
cant differences. Comparing these reduced 
samples of 34 executives and 31 professors, 
we obtained chi-square values of 5.44 (p = 
.02) and 3.86 (p = .05) for hypotheses one 
and two, respectively. These results indicate 
clearly that the differences obtained in the 
pilot study were, in the case of hypotheses 
three through seven, a function of chance 
fluctuation, but that the first and second hy- 
potheses, having been essentially cross vali- 
dated, referred to reliable differences in the 
two groups. Accordingly, in our further analy- 
sis of these two factors, we have employed 
the larger samples of 44 and 41 cases. 

Hypothesis one, which was initially re- 
stricted to Plate 2 of the PAT (a patient in 
a hospital bed in an apparently unhappy 
mood, in an apparently happy mood, and 
having his temperature taken by a nurse), 
may be considered as indicating fear of ill- 
ness. The patient is described directly as 
worried about his condition, depressed about 
his health, afraid of an impending operation, 
sorry for himself because of his physical con- 
dition, etc.; or he is said to be thinking of 
certain diseases which he might possibly have, 
such as cancer, appendicitis, or some incur- 
able malady. One further alternative occurs 
when in an initial statement the subject indi- 
cates that the patient is worried, depressed, 
glum, afraid, etc., without giving a cause im- 
mediately, while indicating the nature of the 
anxiety at the end of his story by describing 
the man as reassured at finding his tempera- 
ture down or happy again because he feels 
much better. These responses were given by 
61.4% of the executives and only 29.3% of 
the professors in the total samples, the chi-square 
value being 7.54 (p < .01). There 
was, however, no reliable difference between 
the two groups in frequency of references to 
worry, depression, fear, etc. What is charac- 
teristic of the executives is the specification 
of a particular kind of anxiety, namely, that 
which has illness or injury as its object. 

As a further check on these results, the 
same scoring system was applied to Plate 12 
(a patient in a hospital bed in three different 
moods—happy, neutral, and unhappy). Al- 
though this plate elicits less reference to 
physical anxiety in general (38.6% for the 
executives, 17.1% for the professors), the 
difference is still reliable with a chi square of 
3.91 (p < .05). Finally, in order to determine 
whether these differences could be attributed 
to a heightening of physical anxiety 
in the executive group or a lowering in the 
professors, the scoring procedure was applied 
to the second control group and the results 
compared with the executive frequencies. 
The percentages obtained were 12.0 for Plate 
2 (χ² = 13.97, p < .001) and 16.0 for Plate 
12 (χ² = 2.84, p < .10). Combining the two 
plates (2 and 12), some reference to physical 
anxiety was found in 72.7% of the executive 
records, in 41.5% of the professors, and in 
28.0% of the second control group. 
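
The chi-square comparisons in this paragraph and the preceding one can be approximately reproduced from the reported percentages. The sketch below rests on two assumptions not stated in the text: that the frequencies are the whole-number counts implied by the percentages and the sample sizes (44, 41, and 25), and that the authors applied Yates' continuity correction to each 2 x 2 table. The function and variable names are hypothetical.

```python
def yates_chi_square(a: int, b: int, c: int, d: int) -> float:
    """Chi-square with Yates' continuity correction for a 2x2 table.

    a, b: group 1 with / without the response
    c, d: group 2 with / without the response
    """
    n = a + b + c + d
    numerator = n * (abs(a * d - b * c) - n / 2) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def count_from_percent(percent: float, n: int) -> int:
    """Recover a whole-number frequency from a reported percentage (assumed)."""
    return round(percent / 100 * n)


N_EXEC, N_PROF, N_CTRL = 44, 41, 25  # sample sizes given in the text

# Physical-anxiety responses, rebuilt from the reported percentages.
exec_p2 = count_from_percent(61.4, N_EXEC)
prof_p2 = count_from_percent(29.3, N_PROF)
ctrl_p2 = count_from_percent(12.0, N_CTRL)
exec_p12 = count_from_percent(38.6, N_EXEC)
prof_p12 = count_from_percent(17.1, N_PROF)
ctrl_p12 = count_from_percent(16.0, N_CTRL)

results = {
    "Plate 2, executives vs. professors": yates_chi_square(
        exec_p2, N_EXEC - exec_p2, prof_p2, N_PROF - prof_p2),
    "Plate 12, executives vs. professors": yates_chi_square(
        exec_p12, N_EXEC - exec_p12, prof_p12, N_PROF - prof_p12),
    "Plate 2, executives vs. second control": yates_chi_square(
        exec_p2, N_EXEC - exec_p2, ctrl_p2, N_CTRL - ctrl_p2),
    "Plate 12, executives vs. second control": yates_chi_square(
        exec_p12, N_EXEC - exec_p12, ctrl_p12, N_CTRL - ctrl_p12),
}

for label, value in results.items():
    print(f"{label}: chi-square = {value:.2f}")
```

Under these assumptions the four statistics come out near 7.56, 3.86, 13.86, and 2.86, each within about 0.1 of the value reported in the text. This is consistent with a continuity correction having been used, although it does not establish that it was.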

The second variable was initially found on 
Plate 3 (a man working at a machine, stand- 
ing by his machine, and puzzling over the 
machine). This characteristic is rather diffi- 
cult to denote with a single term, but in gen- 
eral involves a story in which the man is de- 
scribed as facing a problem, usually a broken 
machine or lack of skill in operating the ma- 
chine, which he cannot solve himself and 
which requires the assistance of another 
person. More specifically, the man is de- 
scribed as waiting for help, watching another 
worker, waiting for the power to come back 
on, or blaming himself because he could not 
solve his problem without getting assistance. 
The reference to a need for help from some 
more experienced person is the critical factor 
here, the potential helper usually being de- 
scribed as a repairman, foreman, mechanic, 
or just someone who can fix the machine. 
Apparently these responses are a substitute 
for a successful solution to the problem and 
not for failure. This supposition is rein- 
forced by the fact that 43.2% of the execu- 
tives respond in this manner and 19.5% of 
the professors (χ² = 4.42, p < .05); but if 
the number of “success” responses is added 
in, this difference completely disappears. That 
is, where the college professors tended to see 
the worker as attempting and succeeding in 
the solution of his problem, the executives, in 
a majority of cases, substituted help from an 
outside source, the number of ascribed fail- 
ures remaining the same in the two groups. 

Again this result was checked against other 
plates on the PAT, in this case 9 (a man 
working at a machine, standing by it, and re- 
ceiving instruction) and 15 (a man working 
happily, working sadly, and standing by his 
machine in an unhappy mood). Neither of 
these plates is interpreted sufficiently often as 
presenting a problem situation which must be 
reacted to in some manner to allow independ- 
ent statistical analysis. Therefore, a “help” 
response was scored if it occurred on either 
of the two plates. The resulting percentages 
were 50.0 for the executives and 24.4 for the 
professors (χ² = 4.83, p < .05). The second 
control group gave percentages of 12.0 for 
Plate 3 and 24.0 for 9 and 15 combined with 
chi squares of 5.85 (p < .02) and 3.45 (p < 
.10), respectively, when compared with the 
executives. As with physical anxiety, the 
“help” response occurred in somewhat over 
30% more executives than college professors 
when all three plates were considered (execu- 
tives, 68.2%; professors, 34.2%; second con- 
trol group, 28.0%). 


Discussion 


Although it appears that we have isolated 
two rather striking characteristics of our ex- 
ecutive group, we are still left with the prob- 
lem of determining what these characteristics 
mean in terms of the total personality con- 
stellations of the individuals involved and 
how they are related to the specific type of 
work environment in which these men op- 
erate. Furthermore, there remains the ques- 
tion of whether these characteristics are pro- 
duced in the job situation by the demands 
and rewards attached to being an executive 
or whether these are long-term factors in the 
individual’s adjustment which may in part 
have contributed to his rise to a position at 
the top of the business hierarchy. 

A possible clue to the answers to these ques- 
tions appears consistently in the results of 
prior studies of the executive group. Henry 
(4) has emphasized a pervasive fear of fail- 
ure which is thought to motivate many ex- 
ecutives in their occupational strivings. Simi- 
lar conclusions have been reached by Gardner 
(2), and Wald and Doty (8). Unfortunately, 
the PAT does not contain pictures which are 
apt to elicit material indicative of such anx- 
ieties and, therefore, cannot offer specific in- 
formation on this factor. It seems probable, 
however, that this fear of failure and the fear 
of illness found in the PAT are essentially 
the same thing. Either failure or illness 
would block the executive in his search for 
success and achievement. Illness would de- 
prive him of the capacity to satisfy his need 
for activity and would stagnate the organiz- 
ing and decision-making abilities which are 
so crucial to the accomplishments of these in- 
dividuals. It is, of course, possible that this 
fear of illness may have a basis in reality, 
particularly in view of the age level of the 
group. However, there is no reason why 
there should be more illness among the ex- 
ecutives than the professors. Furthermore, 
Wald and Doty (8) indicate that for their 
sample, which appears to be very similar to 
the present one, 79% of the group reported 
excellent health. It is also noteworthy that 
90% reported that they were not at all or 
only slightly worried about their health. 
This appears to be in contradiction to the 
present findings but in actuality may not be, 
since it seems rather improbable that such a 
group would care to admit fears of injury or 
failure in view of the fact that our culture 
considers such anxieties inappropriate for 
those in responsible positions. It also very 
well may be that these fears rarely, if ever, 
become conscious as such. In fact, it is the 
primary value of projective methods of per- 
sonality measurement that by decreasing the 
degree of ego reference in the response, it is 
possible to obtain information that the sub- 
ject is unable or unwilling to make directly 
available. 

Turning now to our second finding, that the 
executives tend to rely more frequently on 
help from others in solving their problems, 
we find something which appears at first to 
resemble the delegating of tasks to subordi- 
nates which is so characteristic of any high- 
level administrative position. However, the 
repairman or foreman who is introduced as 
the agent here is essentially an expert at the 
job at which the worker is employed and as 
such takes on the appearance of an authority 
or father figure. The responses seem to depict 
the hero as waiting patiently for help with 
his problem rather than as assigning a job to 
a subordinate. This being the case, however, 
the possibility arises that the response indi- 
cates primarily knowledge of the actualities 
of the industrial work situation where repairs 
are done by mechanics and instructions given 
by foremen. In other words, the executives, 
a number of whom are with industrial con- 
cerns of various types, may be reflecting their 
familiarity with rules which forbid the indi- 
vidual machine operator to attempt to adjust 
or repair his machine. This information 
would be less apt to be available, or to influ- 
ence responses if it were available, among the 
professors and the second control group. To 
check on this possibility, we selected from the 
representative normal sample all those PAT 
protocols which had been completed by semi- 
skilled factory workers, 56 cases in all. Fore- 
men, set-up men, millwrights, and factory me- 
chanics were excluded since these groups are 
not subject to those rules which forbid ad- 
justing and repairing equipment. Our sam- 
ple, then, contained those workers who would 
be most apt to describe the hero as waiting 
for help if in fact this response were a func- 
tion of common factory practices. Analysis 
of the stories given on Plate 3, however, re- 
vealed that only 8.9% of this group gave 
“help” responses which, when compared with 
the executives, yielded a chi square of 13.85 
(p < .001). 

We are inclined to believe, then, that in de- 
scribing his heroes as waiting for help the 
executive is projecting his own quite real feel- 
ings of helplessness in dealing with complex 
job demands and is reflecting the fact that in 
order to solve problems he must rely on the 
assistance of staff personnel, experts, and ad- 
visors who quite generally have more knowl- 
edge of specific problems than the executive 
himself. It is interesting to note here that 
Cohen and Cohen (1) have suggested that 
administrative success is very closely linked 
with the ability to get others to do things that 
one desires, a skill which may well have been 
learned in early childhood while attempting 
to get one’s wishes satisfied by parents and 
siblings. 

Accordingly, our findings may be consid- 
ered as suggesting a view of executive per- 
sonality and motivation which runs as fol- 
lows. The typical executive is a person who 
suffers from fears of failure and illness and 
who has a rather deep conviction of his own 
helplessness in attempting to solve many of 
the complex problems which face him. On 
the other hand, he has confidence in his own 
ability to get the cooperation of others in 
solving problems and in carrying out the 
necessary work. He believes that he can ob- 
tain help from experts, specialists, and those 
with a higher position in the company. He 
has developed to a high degree the social skills 
necessary for getting others to do as he 
wishes. Given the presupposition that no one 
man can possess all the knowledge required 
to direct alone a large business enterprise or 
some specific feature of it such as sales or 
production, the success of the executive would 
depend largely on his very unique set of abili- 
ties, primarily social and persuasive in na- 
ture, which are essential to the job. A fear 
of failure may well be among the motivating 
forces which guide the potential executive to 
seek a top position, but once he has gained 
this position, his fear of failure is continually 
reinforced by the belief that success is a 
function of superior intelligence and knowl- 
edge, while feeling that he can never have the 
intelligence to solve all the problems that his 
job presents. Thus, being forced frequently 
to depend on others for information, problem 
solutions, and the actual performance of 
many tasks, he comes to view his position as 
somewhat undeserved and precarious. What 
he fails to recognize, or recognizes only in- 
completely, is that the real requirements of 
his job are not just intelligence and knowl- 
edge, characteristics which many individuals 
may possess, but these latter traits plus the 
ability to use and manipulate the intelligence 
and knowledge of others in such a way as to 
fill the gap between his own intellectual ca- 
pacities and those which the job requires. 

In presenting this picture of the “typical” 
executive, it should be made clear that we are 
delving into the realm of theory and that only 
a rough framework for such a position may be 
derived from the present study. Even this 
may not hold up under future investigation 
since we have not been able to rule out com- 
pletely the possibility that our executive sam- 
ple is a highly selected one and thus not rep- 
resentative of the group as a whole. Further, 
there are some protocols in our executive sam- 
ple which suggest personalities differing widely 
from the “typical” description we have pre- 
sented. Certainly, not all executives possess 
the characteristics we have noted. Also, in 
applying these findings to individual proto- 
cols, it probably would be advisable to make 
interpretive statements only where the physi- 
cal anxiety or “help” response is given more 
than once in a single record. 


Summary 


The Tomkins-Horn Picture Arrangement 
Test has been employed in a study of the 
personality characteristics of 44 top-level ex- 
ecutives. Two control groups were used, one 
consisting of 41 college professors and the 
other of 25 males of an age, educational, and 
intelligence level similar to those of the execu- 
tives. A qualitative analysis of the verbal 
material aimed at testing seven hypotheses 
developed as the result of a pilot study failed 
to substantiate five of these but did support 
the other two. Thus two characteristics as- 
sociated with the executive occupations ap- 
pear to have been isolated. The first of these 
may be designated as a generalized fear of 
illness, while the second is a tendency to re- 
act to problem situations with a feeling of 
some degree of helplessness and a sense of 
being dependent on others for a solution. 


Received September 23, 1954. 


References 


1. Cohen, Mabel B., & Cohen, R. A. Personality as 
a factor in administrative decision. Psychiatry, 
1951, 14, 47-53. 

2. Gardner, B. B. What makes successful and 
unsuccessful executives? Advanc. Mgnt, 1948, 
13, 116-123. 

3. Guilford, Joan S. Temperament traits of executives 
and supervisors measured by the Guilford 
Personality Inventories. J. appl. Psychol., 
1952, 36, 228-233. 

4. Henry, W. E. The business executive: the 
psychodynamics of a social role. Amer. J. Sociol., 
1949, 54, 286-291. 

5. Meyer, H. D., & Pressel, G. L. Personality test 
scores in the management hierarchy. J. appl. 
Psychol., 1954, 38, 73-80. 

6. Thorndike, R. L. Two screening tests of verbal 
intelligence. J. appl. Psychol., 1942, 26, 128-135. 

7. Tomkins, S. S., & Miner, J. B. Contributions to 
the standardization of the Tomkins-Horn Picture 
Arrangement Test: plate norms. J. Psychol., 
1955, 39, 199-214. 

8. Wald, R. M., & Doty, R. A. The top executive—a 
firsthand profile. Harvard Bus. Rev., 1954, 
32, 45-54. 





The Journal of Applied Psychology 
Vol. 39, No. 5, 1955 


Group Performance in a Cognitive Task


Andrew L. Comrey and Carolyn K. Staats


A series of previous articles (1, 2, 3) de-
scribed the results of experiments with a 
group manual dexterity task... The present 
experiment is an attempt to determine the ex- 
tent to which performance on a group cog- 
nitive task can be predicted from knowledge 
of individual performance on a similar task. 


Procedure 


Four crossword puzzles of approximately equal 
difficulty were constructed. Difficulty was estimated 
on the basis of the number and difficulty of the 
words used to make up the puzzles. The codes, or 
lists of word definitions, for each puzzle were printed 
on a sheet separate from that of the puzzle work- 
sheet, or outline, itself. For the two puzzles used 
as individual tasks, puzzles I-1 and I-2, both hori- 
zontal and vertical codes were printed on the same 
sheet, while for the puzzles used as a group task, 
puzzles G-1 and G-2, each horizontal and each verti- 
cal code was printed on a separate sheet of paper. 
Thus, for the four puzzle outlines there was a total 
of six code sheets.²

A sample of 120 male subjects was drawn from 
undergraduate psychology courses. Each “group” 
studied consisted of two men. Some pairs were com- 
posed of friends, but in the majority of cases, the 
individuals were either unknown to each other or 
were only casually acquainted. 

The subjects in each pair were seated at one corner 
of a table, one on either side of the corner. After 
establishing that each subject understood how a 
crossword puzzle is solved, each member of the 
group was handed a copy of the first individual 
puzzle with instructions to complete it as best he 
could in the 10 minutes allotted. After the first 10 
minutes, each member was given the second indi- 
vidual puzzle, again with the same instructions. 

1 This research was supported by a grant from the
University of California at Los Angeles.

2 The four puzzle outlines and the six code sheets
have been deposited with the ADI Auxiliary Publi-
cations Project. Order Document No. 4649 from
Chief, Photoduplication Service, ADI Auxiliary Pub-
lications Project, Library of Congress, Washington
25, D. C., remitting $1.25 for microfilm (images 1
inch high on standard 35 mm. motion picture film)
or $1.25 for photocopies (6 × 8 inches) readable with-
out optical aid. Advance payment is required.
Make checks or money orders payable to: Chief,
Photoduplication Service, Library of Congress.


Following the completion of two trials of indi-
vidual performance, the first of the two group puzzle
outlines was placed between the two subjects. One
subject was given the horizontal code for the puzzle
while the other was given the vertical code sheet.
The experimenter then gave the following instruc-
tions:


“This puzzle is very similar to the two you have 
just completed, except that this time Subject A has 
the horizontal code and Subject B has the vertical 
code, while there is only one outline between you. 
You are to work together to fill in the puzzle. You 
may converse as much as you care to, but you may 
not show your partner your own code sheet, nor in 
any way tell him what is printed upon it. The 
puzzle will be scored by the number of squares cor- 
rectly filled in, so make sure you have only one 
letter in each. Are there any questions? You have 
10 minutes.” 

After the 10 minutes the subjects were asked to 
change seats and the same instructions were read for 
the second group puzzle. This time, however, Sub- 
ject A was given the vertical code while B used the 
horizontal code. 

This completed the experimental session, usually 
taking 45 minutes. Upon leaving the room each 
subject was handed a Guilford-Zimmerman Tempera- 
ment Survey (4) and requested to return it the fol- 
lowing week. 

The method of analysis was similar to that used in 
the previous studies (1, 2, 3). The steps were as 
follows: 


1. For each person, the individual scores were com- 
puted for the individual puzzles I-1 and I-2 by 
counting the number of squares correctly filled in. 
These “alternate form” scores were used to obtain 
reliability measures and were averaged to give an 
index of each person’s individual score. The same 
procedure was followed for the “group” scores, ex- 
cept that group scores were for pairs rather than for 
single individuals. 

2. Each member of each pair was designated as 
“high” or “low,” respectively, on the basis of his 
mean individual performance score in relation to 
that of the other member of his pair. Reliabilities 
for high, low, and group scores were determined by 
correlating scores of the alternate forms for the vari- 
ous categories. 

3. Pearson intercorrelations were computed be- 
tween high, low, and group scores. The correlations 
of the high and low scores with the group scores 
were corrected for attenuation in both variables in 
order to estimate the values which might be ex- 
pected between the given variables had the measures 
involved been entirely free of errors of measurement. 

4. Beta weights and the multiple correlation co- 
efficient for predicting group scores from high and 
low scores were computed. 

5. Phi coefficients for each of the 10 factors of the
Guilford-Zimmerman Temperament Survey with the 
group scores of “high” individuals, “low” individu- 
als, and with all group scores were computed. In 
this third case, each group score was duplicated in 
that two persons shared each group score. 

6. A Pearson correlation coefficient between uni- 
versity grade-point averages and all group scores 
was computed. That is, each person had both a 
group score (identical with that of his partner) and 
a grade-point average. These scores were correlated. 
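
The correction for attenuation mentioned in Step 3 can be sketched as follows. The article does not print the formula, so this sketch assumes the classical form, in which the observed correlation is divided by the square root of the product of the two reliabilities. The function name and the input values are hypothetical and are not taken from the study.

```python
import math


def correct_for_attenuation(r_xy: float, r_xx: float, r_yy: float) -> float:
    """Estimate the correlation between error-free measures.

    r_xy is the observed correlation; r_xx and r_yy are the
    reliabilities of the two measures (assumed to lie in (0, 1]).
    """
    if not (0 < r_xx <= 1 and 0 < r_yy <= 1):
        raise ValueError("reliabilities must lie in the interval (0, 1]")
    return r_xy / math.sqrt(r_xx * r_yy)


# Hypothetical illustration only; these are not values from the article.
estimate = correct_for_attenuation(r_xy=0.60, r_xx=0.80, r_yy=0.96)
print(round(estimate, 3))
```

Because both reliabilities appear in the denominator, the corrected value is never smaller in absolute size than the observed one, and the adjustment grows as either measure becomes less reliable.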


Results 


The analysis of reliability of the two alter-
nate forms of the individual task (puzzles I-1
and I-2) yielded a coefficient of .79 while the
coefficient of the two group tasks was .96. 

The results of the statistical analysis of 
Steps 1 through 4 have been summarized in 
Table 1. 

The analysis of the relationship between 
the 10 factors of the Guilford-Zimmerman 
Temperament Survey and the group scores 
resulted in only one significant phi coeffi- 
cient. This was the correlation between “low” 
individuals’ group scores and the factor of 
“Thoughtfulness.” Since 30 phi’s were com- 
puted, however, i.e., high and low groups 
separately and then pooled, the one signifi- 
cant case may be accounted for on the basis 
of chance. The correlation between grade- 
point average and group scores resulted in a 
coefficient of .10. It was therefore consid- 
ered unnecessary to compute the proportion 
of variance of group scores determined by 
these variables. 
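
The chance argument for the single significant phi can be made concrete with a short calculation. This sketch assumes a .05 significance level and 30 independent tests; neither assumption is stated in the article, and the three sets of coefficients share data, so the figures are only a rough guide.

```python
def prob_at_least_one(alpha: float, n_tests: int) -> float:
    """Probability of one or more significant results when every
    null hypothesis is true and the tests are independent."""
    return 1.0 - (1.0 - alpha) ** n_tests


ALPHA = 0.05   # assumed significance level (not given in the article)
N_TESTS = 30   # number of phi coefficients computed

expected_false_positives = ALPHA * N_TESTS
p_any = prob_at_least_one(ALPHA, N_TESTS)

print(f"expected significant by chance: {expected_false_positives:.1f}")
print(f"P(at least one significant):    {p_any:.2f}")
```

Under these assumptions more than one spurious result would be expected on average, which is consistent with treating the lone significant coefficient as a chance finding.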


Table 1


Summary of Results


                                        Beta
          High     Low     Group        Wt.

High      .7?      .25     .76*         .63
Low                .6?     .67*         .51
Group                      .96

R² = .82        R = .91


* These correlation coefficients have been corrected for atten-
uation in both variables, a step undertaken before they were
used to obtain beta weights. This was done to estimate the
extent of the association in the absence of errors of measure-
ment. The “high-low” correlation was not corrected because
it represents merely an artificial restriction in the regression
due to the method of selecting individuals for the high and low
groups.
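
The beta weights and multiple correlation in Table 1 can be checked from the three correlations it reports. This sketch assumes the standard two-predictor formulas for standardized regression weights and reads the table entries as .25 (high with low), .76 (high with group), and .67 (low with group); the function name is hypothetical.

```python
import math


def two_predictor_regression(r_1y: float, r_2y: float, r_12: float):
    """Return (beta_1, beta_2, R squared) for predicting y from two
    standardized predictors with the given intercorrelations."""
    denom = 1.0 - r_12 ** 2
    beta_1 = (r_1y - r_2y * r_12) / denom
    beta_2 = (r_2y - r_1y * r_12) / denom
    r_squared = beta_1 * r_1y + beta_2 * r_2y
    return beta_1, beta_2, r_squared


# Correlations as read from Table 1 (decimal points restored).
beta_high, beta_low, r_squared = two_predictor_regression(
    r_1y=0.76,  # high score with group score, corrected for attenuation
    r_2y=0.67,  # low score with group score, corrected for attenuation
    r_12=0.25,  # high score with low score, uncorrected
)
multiple_r = math.sqrt(r_squared)

print(round(beta_high, 2), round(beta_low, 2))
print(round(r_squared, 2), round(multiple_r, 2))
```

The results agree with the tabled beta weights of .63 and .51 and with R² = .82 and R = .91, which supports the reading of the table given above.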


Discussion 


The results of the present experiment are 
particularly interesting when contrasted with 
those of previous articles concerned with a 
manual task (1, 2, 3). In these earlier ex- 
periments less than half of the total variance 
on the group performance task could be pre- 
dicted on the basis of how well the team mem- 
bers performed individually (R² = .38, .44,
.49). Yet, in the present experiment, the
square of the multiple correlation coefficient, 
noted in Table 1, indicates that 82 per cent 
of the variance in an errorless measure of 
group performance could be predicted from a 
linear combination of perfectly reliable high 
and low individual scores. While it is ap- 
parent that in this and the three previous ex- 
periments the group task consisted ostensibly 
of a recombination of the elements of the in- 
dividual task plus variables in the area of co- 
operation between the two individuals, it is 
also evident that a qualitative difference exists 
between the present cognitive group task and 
the previous manual group tasks. It would 
seem, therefore, that the group cognitive task 
selected for this experiment did not elicit 
nearly as much interaction as did the manual 
group tasks.³

The foregoing considerations lead us to the 
conclusion that success in predicting group 
performance from knowledge of individual 
performance is highly dependent upon the na- 
ture of the task involved. This is not to say, 
of course, that group manual tasks necessarily 
involve more social interaction than group 
cognitive tasks of all kinds. It must be 
pointed out, however, that simply because 
two or more individuals are seated together, 
apparently working together, does not neces- 
sarily imply that cooperation is a highly sig- 
nificant factor. The cooperative nature of the 
task should be empirically demonstrated be- 
fore proceeding to investigate the relationship 
of other variables, e.g., leadership, democratic 
vs. authoritarian structure, etc., to the so- 


8 The term “interaction” is used here to designate 
all that true group performance variance which can- 
not be predicted from true individual performance 


variance. In this sense it may include individual 
skills demanded by the group situation as well as 
effects resulting from particular combinations of 
persons. 
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called cooperative task variable. Experiments 
purporting to study group phenomena have 
not always rested on firm ground in this 
respect. 


Summary 


Sixty pairs of male undergraduate students 
were given two crossword puzzles to solve as 
individuals and two to solve cooperatively. 
Members of each pair were divided on the 
basis of the mean of the two individual trials 
into “high” and “low” categories. Reliabili- 
ties were determined for individual and group 
performances as well as for high and low per- 
formances. Correlations of high and low per- 
formances with the group performance and 
with each other were computed and in cer- 
tain cases corrected for attenuation. Regres- 
sion weights and the multiple correlation co- 
efficient were obtained for predicting group 
performance from high and low individual 
performances. 

Eighty-two per cent of the true group per-
formance variance on this task could be pre-
dicted from a knowledge of errorless measures
of individual performance. Correlations be- 
tween grade-point average or factors of the 
Guilford-Zimmerman Temperament Survey 
and group performance were negligible. It is 
suggested that the nature of the task is im- 
portant in studies of group performance and 
that the existence of “group” phenomena 
should be empirically demonstrated rather 
than assumed. 


Received February 26, 1955. 


The Effect of Randomizing the Cleeton Vocational
Interest Inventory Items


R. T. Kelleher


The purpose of this experiment was to de-
termine the relative agreement between the 
Cleeton Vocational Interest Inventory (Form 
A) and a revised form of the Cleeton in which 
the items are in random order. The strong- 
est and most frequent criticism of the Cleeton 
concerns the grouping of the items by occu- 
pational families (1, p. 650; 2; 3, p. 472). 

Although the standard form greatly facili- 
tates scoring, those who have used and studied 
the Cleeton are unanimous in their misgivings 
about the item grouping. It could be hy- 
pothesized that subjects taking the standard 
form are inclined to respond with reference to 
the occupational scales rather than the indi- 
vidual items. If this analysis were correct, 
the scores on the revised form should differ 
significantly from those on the standard. 


Procedure 


The revised form was constructed by randomly
drawing items from the standard form and number- 
ing them according to the order in which they were 
drawn. The subjects (Ss) were 120 male students 
at the School of Commerce, New York University. 
All were members of evening classes in elementary 
psychology. The median age of both control and 
experimental groups was 28. All Ss were adminis- 
tered the Kuder Preference Record, and then after 
a short rest, one of the Cleeton forms. Half of the 
Ss completed the standard form, and the other half 
completed the revised form. It was assumed that
the reliability of the Cleeton would not be markedly 
altered by revision. 


Table 1


Analysis of Variance of Cleeton Measures


Source                      MS

Cleeton forms             26.33
Cleeton scales          1324.56
Treatments × scales      253.95
Within groups            139.49



1 This material is derived from the author’s dis-
sertation submitted in April, 1953, in partial satis-
faction of the requirements for the M.A. degree at
New York University. The dissertation was di-
rected by Dr. R. Gilbert.


Results 


The data were treated by an analysis of 
covariance technique in which the Kuder was 
used as a statistical control for initial inter- 
est; however, the experimental and control 
groups showed no differences in means and 
variances on the Kuder. Thus, only the sim- 
ple analysis of variance of the Cleeton meas-
ures is presented in Table 1. Only eight
Cleeton scales were utilized because the origi- 
nal design of the study necessitated their be- 
ing comparable with the Kuder scales. 
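
The mean squares in Table 1 are enough to form the F ratios, even though the degrees of freedom did not survive in this copy. The sketch below assumes that the within-groups mean square is the error term for every effect and that the four values pair with the four sources in the order listed; both are assumptions, not statements from the article.

```python
# Mean squares as listed in Table 1 (pairing with sources is assumed).
mean_squares = {
    "Cleeton forms": 26.33,
    "Cleeton scales": 1324.56,
    "Treatments x scales": 253.95,
}
MS_WITHIN = 139.49  # assumed error term for all three effects

# Each F ratio is the effect mean square over the error mean square.
f_ratios = {source: ms / MS_WITHIN for source, ms in mean_squares.items()}

for source, f_value in f_ratios.items():
    print(f"{source:<22} F = {f_value:.2f}")
```

An F below 1 for forms would mean the two forms differ less than would be expected from within-group variation alone, which fits the conclusion that the forms are answered alike. Judging the other two ratios would require the degrees of freedom.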


Discussion 


The results indicate that the standard, 
easily scored form of the Cleeton is responded 
to no differently than the randomized form. 
This finding would seem to suggest that the 
Ss respond to the item rather than to the 
scale in which it is placed. It would be inter- 
esting to determine whether the same results 
would be obtained with younger Ss and under 
conditions of high motivation. Similar find- 
ings under these conditions would lend greater 
generality to the results. 


Summary 


This study of the Cleeton Vocational Inter- 
est Inventory was conducted with 120 male, 
evening students at New York University. 
Initially, all Ss completed the Kuder Pref- 
erence Record. Subsequently, half were ad- 
ministered a revised form of the Cleeton in 
which the items were randomized. Analysis 
of the results indicated that taking the dif- 
ferent forms did not result in a differential 
effect upon the scores. 


Personality Adjustment and the Study of Abnormal
Psychology ¹


Eugene S. Mills 
Whittier College 


In recent years psychologists have wit- 
nessed a gradual merging of two areas of in- 
terest within their field. Increasingly the 
teaching of psychology and the study of stu- 
dent personality have become related in the 
efforts of both the academic psychologist and 
the clinician. This rapprochement is appar- 
ent in many articles dealing with the function 
of psychology courses in college curricula (1, 
2, 11, 13) as well as in those which treat 
more directly the relation between the stu- 
dent’s personality and the study of psychol- 
ogy (4, 6, 10, 12). Although it may be gen- 
erally conceded that “there is some kind of 
reciprocal relation between intellectual de- 
velopment and emotional adjustment” (13, 
p. 40), much basic research is still needed in 
order to understand more fully the nature of 
the student’s experience in psychology courses. 

While some research has dealt with student 
evaluation of particular teaching techniques 
(3) and with the influence of personality fac- 
tors in the learning of psychological material 
(6), relatively little attention has been given 
to the selective function of psychology courses 
in the college curriculum. Elsewhere the 
writer (7) discussed the selective function of 
the abnormal psychology course and pre- 
sented evidence of the importance of emo- 
tional factors in this selective process. The 
demonstration that rather important emo- 
tional factors may be related to the decision 
to study abnormal psychology leads one to
question whether participation in the course 
may not also have an emotional impact on the 
student. In an attempt to provide some an- 
swers to this question, the following hypothe- 
sis was investigated: The study of abnormal 
psychology has a measurable effect upon the 
personality adjustment of some college stu- 
dents. 


1 Papers reporting certain aspects of this research 
were read before the 1954 meetings of the Western 
Psychological Association in Long Beach, California, 
and the American Psychological Association in New 
York City. 


Method 


Subjects. The subjects (Ss) of the study were 
chosen from the enrollment in two undergraduate 
courses in the curriculum of a liberal arts college. 
The experimenter (E) had no official connection
with either course. Group A was composed of 10 
men and 11 women students selected from a course 
in abnormal psychology according to the following 
criteria: 

1. The student must not be currently enrolled in 
any other course in psychology, sociology, or crim- 
inology. 

2. The student must not at that time be under- 
going any psychological or medical therapy. 

3. The student must have completed at least two 
years of college. 


The Ss in Group A had a mean age of 22 years and 
a median age of 21 years. 

Group B was composed of 11 men and 11 women
students selected from a modern European history 
course. The criteria for the selection of the Ss in 
Group B were the same as those for Group A, the 
only exception being that none of the Ss was en- 
rolled in any psychology course. The mean age for 
Ss in Group B was 20.7 years, with a median age of 
20 years. 

Procedure. At the beginning of and just prior to 
the end of the semester the Rorschach test was ad- 
ministered individually to all Ss in both groups and 
scored according to the Klopfer system (5). In or- 
der to derive adjustment ratings which were largely 
objective and which were sufficiently consistent to 
lend themselves to test-retest comparison, the Mun- 
roe Inspection Technique (10) was utilized. A sub- 
analysis of certain Rorschach scores was made, in- 
cluding a study of the erlebnistyp or introversion- 
extraversion relationship in each record. 

A story-completion test for college students (8) 
was administered to all Ss in both groups at the be- 
ginning and again at the close of the semester. This 
test was patterned after the Madeleine Thomas Com- 
pletion Stories Test (9) and consists of a series of 
story items dealing with critical areas of student 
life. Typical story items are: 


a. A student is in college. He writes a letter home. 
What does he say in the letter? 

b. This student answers a question incorrectly in 
class. In the hall after class another student says to 
him: “You sure messed that one up, didn’t you?” 
What does he do? 

c. The student has been going steady with a cer- 
tain girl. One night she refuses to go out with him. 
Why? 


Table 1


Statistical Summary of Munroe Check-List Data for
Groups A and B on Initial Test and Retest


                 Initial Test           Retest
Statistic      Group A  Group B    Group A  Group B

Mean            14.76    10.22      14.60     8.59
Median          16.00    11.20      14.00     8.10
SD               5.67     2.70       4.52     3.13
SE Mean          1.26      .58       1.04      .68

               Group A  Group B

SE Diff.         1.64      .89
t                 .13     1.82




A theme analysis was made of all story completion 
material. 

In addition to the above operations, a question- 
naire was administered to all students on the oc- 
casion of the second class meeting and a revised 
form was administered again at the close of the se- 
mester’s work. The questionnaire provided informa- 
tion which could be used as a background for con- 
sidering test results. Items were included which 
touched on the student’s motivation for enrolling in 
the history or psychology course, his expectations 
regarding the possible effect of the course upon his 
own life, his vocational and marital plans, and atti- 
tudes toward home. 

All Ss in both groups wrote an autobiography at 
the beginning of the semester. In requesting auto- 
biographical material, E left the situation unstruc- 
tured, asking only that each S submit some account 
of his life. No autobiographical material was ob- 
tained at the close of the semester’s study. 


Results 


When judged according to the Munroe In- 
spection Technique, the Group A Rorschachs 
received a total of 311 checks on initial test 
and 292 checks on retest. Table 1 reveals 
the ratio of the difference between the two 
means to be .13.² This suggests that no sta-
tistically significant difference exists between
the number of Munroe checks on initial test
and retest. 

The Group B Rorschachs received a total
of 225 Munroe checks on initial test and 189
checks on retest. Table 1 reveals the ratio
of the difference between the means to be
1.82. This suggests that no statistically sig-
nificant change occurred in the number of
Munroe checks for the Group B Rorschachs
on initial test and retest.³


2 The initial test data include one Rorschach rec-
ord which received 34 Munroe checks. The S pro-
ducing this record became severely disturbed and
dropped out of school during the semester in which
the research was done. Excluding this case from the
data, calculations reveal a t ratio of 1.03 between
the mean checks for initial test and retest.
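
The Group B ratio of 1.82 can be reproduced from the means and standard errors in Table 1. This sketch assumes the standard error of the difference was formed as the square root of the sum of the squared standard errors, that is, without allowing for the correlation between initial test and retest scores; the article does not state the formula, and the function name is hypothetical.

```python
import math


def t_from_standard_errors(mean_1: float, mean_2: float,
                           se_1: float, se_2: float):
    """Return (SE of the difference, t ratio), treating the two
    means as uncorrelated."""
    se_diff = math.sqrt(se_1 ** 2 + se_2 ** 2)
    return se_diff, (mean_1 - mean_2) / se_diff


# Group B values as read from Table 1: initial test, then retest.
se_diff_b, t_b = t_from_standard_errors(
    mean_1=10.22, mean_2=8.59, se_1=0.58, se_2=0.68
)
print(round(se_diff_b, 2), round(t_b, 2))
```

The computed values match the tabled SE of the difference (.89) and t (1.82). Since the same Ss were tested twice, a formula that takes the test-retest correlation into account would usually give a smaller standard error and a larger t, so the reported ratios are probably conservative.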

An examination of the Rorschach mate- 
rial was made in such a manner as to reveal 
possible changes in introversive-extraversive 
orientation on the part of the Ss in both 
groups. From the M:sum C and (FM +
m):(Fc + c + C′) ratios, the percentage of
responses to the last three cards, and from 
general qualitative indications, two judges 
rated on a five-point scale the degree of in- 
troversion-extraversion for each S. Statisti- 
cal analysis of these initial test-retest ratings 
failed to reveal any significant group changes. 

Two judges made a theme analysis of the
story completion material. They were not
informed of the particular group to which an
S belonged, nor were they able to determine
whether a set of stories was obtained on
initial test or retest. A total of 25 different
themes was detected in the story completions
and these were subjected to statistical analy-
sis. Only two themes differed significantly in
frequency of occurrence on initial test and
retest. Table 2 reveals a significantly smaller
number of escape and depression themes in
the retest responses of the Group A Ss (t of
3.41 and 3.62, respectively). The marked
anxiety in the Group A responses at the time
of the initial test continued undiminished on
retest. This was apparent in the Rorschach
records as well as in the story completion
thematic material. The significantly larger
number of m, k, and K responses present in
the Group A records at the beginning of the
semester was also present at the time of the
retest.


Table 2


Frequency of Occurrence of Story Completion Themes
on Initial Test and Retest: Anxiety,
Escape, Depression


                 Initial Test           Retest
Themes         Group A  Group B    Group A  Group B

Anxiety                    13         26       20
Escape                     10         12       15
Depression                 15         12        9


3 Statistical comparison of the mean number of
Munroe checks for Groups A and B on retest re-
vealed a t ratio of 4.52. The significantly larger
number of Group A Munroe checks compares favor-
ably with the t ratio of 3.33 derived from initial test
differences.

A subanalysis of Rorschach data supports 
the theme analysis in showing an apparent 
decrease in depressive tendencies for the 
Group A Ss at the end of the semester’s 
study. While the total Rorschach picture 
must be considered in order properly to evalu- 
ate the presence of depressive tendencies in 
the S, a study was made of the achromatic 
vs. bright-color responses in each record. 
Klopfer and Kelley hold that “achromatic 
responses in general (Fc, c, and C’) are 
indicative of depressive tendencies only if 
they outnumber bright-color responses (FC, 
CF, and C) at least two to one” (5, p. 243). 
Taking this observation as a lead, study of 
the Rorschach records revealed eight Group 
A cases in which achromatic responses out- 
numbered bright-color responses by at least 
two to one at the beginning of the semester, 
while only four cases were found on retest. 
Two such cases were found among the Group 
B Ss on both initial test and retest. 

Analysis of the initial test data revealed a 
greater expressed interest in, and concern 
over, sexual matters on the part of the Group 
A Ss. This was reflected in the results of 
the story-completion theme analysis with 56 
erotic themes in the Group A responses and 
30 in the Group B responses. The number 
of retest themes remained essentially the same 
with 55 for Group A and 30 for Group B. 
The initial-test Rorschach records contained 
17 sex responses for Group A and only one 
for Group B, while the retest records con- 
tained 12 and two sex responses, respectively. 
These data, plus a qualitative study of the 
test material, lead one to conclude that no 
important Group A change occurred in this 
area during the course of the semester. 

Most of the psychology students gained a 
new awareness of their own personal problems 
and demonstrated an increased freedom in 
discussing them. This was apparent in their 
own evaluation of the course experience and 
in the general analysis of retest material.
There was no unanimity present, however, in
the feelings occasioned by this new awareness 
of problems. While some were openly re- 
lieved by their new discoveries, others ap- 
peared genuinely disturbed. 

Analysis of the story-completion test re- 
sponses provides evidence of the psychology 
student’s increased freedom in recognizing 
and discussing personal problems. While al- 
most all of the story items are written in the 
third person singular, occasionally Ss are suffi- 
ciently preoccupied with the conditions of 
their own personal lives to write their re- 
sponses in the first person. A tabulation was 
made of the number of times the Group A 
and Group B Ss completed items in the first 
person or made personal references elsewhere 
in the test than in the stories proper. There 
were 41 first-person responses and personal 
references in the Group A initial-test com- 
pletions, and 37 in those obtained from Group 
B Ss. However, while the number dropped to 
31 for Group B on retest, the number for 
Group A jumped to 156. The increase for 
Group A from 41 to 156 is statistically sig- 
nificant at the 1% level. 

A follow-up check was made of the grades 
received by Ss in both groups. Of the five 
“A” grades received by members of the ab- 
normal psychology class, four were received 
by Ss found to deviate most markedly 
throughout the test data. The Munroe scores 
for these Ss are among the highest eight 
scores on initial test (poorest adjustment) 
and among the highest seven on retest. A 
Spearman rho of .62 was obtained between 
the Munroe indices of adjustment and final 
grades in the abnormal psychology course. 
This rather high rho supports the belief that, 
at least in one situation, success in dealing 
with psychological material was greater for 
those students who manifested more personal 
anxiety and concern over their own problems 
of adjustment. A rho of −.15 was obtained
between the Munroe indices of adjustment
and final grades in the history class.
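
For readers unfamiliar with the statistic, a Spearman rho of the kind reported above can be computed from rank differences. The sketch below uses the textbook formula for untied ranks and invented scores; the function names and the data are hypothetical and do not reproduce the study's values.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; ties are rejected
    to keep this sketch short."""
    if len(set(values)) != len(values):
        raise ValueError("tied values are not handled in this sketch")
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_rho(x, y):
    """Spearman rank correlation: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Invented data: Munroe check totals and course grade points for five Ss.
munroe_checks = [22, 9, 15, 12, 18]
grade_points = [3.7, 2.3, 3.0, 3.3, 3.5]
rho = spearman_rho(munroe_checks, grade_points)
print(round(rho, 2))
```

A positive rho here means that Ss with more Munroe checks (poorer rated adjustment) tend to hold higher ranks on grades, which is the direction of the relationship reported for the abnormal psychology class.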


Summary 


Personality tests and questionnaire data 
were obtained from psychology and history 
students at the beginning and end of a se-
mester’s study. This material was subjected
to both quantitative and qualitative analysis 
to ascertain whether significant test-retest dif- 
ferences existed which might be attributable 
to course experience. 

The following results were obtained: 


1. When evaluated according to the Mun- 
roe Inspection Technique, the Rorschach rec- 
ords revealed no statistically significant test- 
retest changes in adjustment for either group. 

2. Theme analysis of the psychology stu- 
dents’ responses to the story-completion test 
revealed a significantly smaller number of es-
cape and depression themes on retest (t of
3.41 and 3.62, respectively). Twenty-three
other themes tested showed no significant dif- 
ference on retest. 

3. The psychology students demonstrated 
an increased freedom in recognizing and dis- 
cussing personal problems. 

4. The expressed interest in, and concern 
over, sexual matters on the part of the psy- 
chology students at the beginning of the se- 
mester remained undiminished at the time of 
retest. 

5. A Spearman rho of .62 was obtained be- 
tween the Munroe indices of adjustment and 
final grades in the abnormal psychology 
course with a tendency for the more poorly
adjusted students to make the higher grades.
A rho of −.15 was found between Munroe
indices and final grades for the history stu- 
dents. 


Conclusions 


The findings support the belief that, (a) 
as a group, students studying abnormal psy- 
chology differ in personality from those not 
enrolled in the course, and (b) while the
study of abnormal psychology has little im- 
mediate effect upon personality adjustment, 
measurable changes occur which appear to be 
related to the course experience. 


Personality Traits and Persistence of Interest in
Teaching as a Vocational Choice


Anthony C. LaBue


Although case studies have shown that
personal maladjustment often underlies vo- 
cational dissatisfaction and frequent job 
changes, stability in an occupation, job satis- 
faction, completion of training, or similar 
criteria have not been common subjects of 
study by means of personality inventories 
(3). 

In a study by Seagoe (2), it was found 
that there were no significant differences be- 
tween those student-teachers who remained 
in teacher training and those who left it, al- 
though there was a tendency for the well- 
adjusted, as measured by the Bernreuter Per- 
sonality Inventory, at any level of intelli- 
gence, to remain in training, whereas the 
maladjusted of superior mental ability gave 
up teaching as a vocational choice. 

A part of the writer’s doctoral dissertation
(1) was concerned with identifying differ-
ences in personality traits between students
who persisted and those who did not persist
in their interest in teaching as a vocational
choice. Students in the persistent group
were defined as those who had completed a 
program of teacher preparation and had, upon 
graduation from Syracuse University, ac- 
cepted a teaching position. The nonpersist- 
ent students were defined as those who had 
made application for admission to a program 
of teacher preparation but did not enroll in 
such a program. 


Method 


Sample. The study involved four groups of sub- 
jects (Ss), two groups of women and two groups of 
men classified into the following categories: persist- 
ent females, nonpersistent females, persistent males, 
and nonpersistent males. There were 50 Ss in each 
of these groups, except for the nonpersistent male 
group which was made up of a total sample of 49 
cases. The 50 Ss in each of the first three groups 
had originally been randomly selected from total 
samples of 128, 90, and 96, respectively. For the 
purpose of this report, data were available on 50 
cases in the persistent female group, 49 cases in the 
nonpersistent female group, 47 cases in the persistent
male group, and 28 cases in the nonpersistent male
group.

Procedures. The instrument upon which data were 
based was the Minnesota Multiphasic Personality 
Inventory which was administered to each of the Ss 
at the time of application for admission to a pro- 
gram of teacher preparation. The scores used for 
analysis in this report were the standard scores on 
the following nine diagnostic scales: hypochondriasis, 
depression, hysteria, psychopathic deviate, mascu- 
linity-femininity, paranoia, psychasthenia, schizo- 
phrenia, and hypomania. In analyzing the data, t 
ratios were computed to test the mean differences 
between the female groups and between the male 
groups. In addition, point-biserial correlation coeffi- 
cients were computed to determine the relationship 
between scores on the nine diagnostic scales and per- 
sistence of interest in teaching as a vocational choice 
(a nonvariable dichotomy). The findings which fol- 
low are reported separately for the women and for 
the men in the two groups under consideration (per- 
sistent and nonpersistent). 
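
The two statistics named above can be illustrated from summary figures alone. The sketch below is not the writer's own computation: it assumes a pooled-variance t ratio in which the reported SD's were computed with N (not N - 1) in the denominator, and it uses the standard identity r = t / sqrt(t^2 + df) to obtain a point-biserial coefficient from t. The function names are hypothetical. The input values are those reported for the psychopathic deviate scale of the male groups in Table 2.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """t ratio for two independent groups from summary statistics.

    Assumption: the SD's were computed with N in the denominator, so the
    pooled variance is (n1*sd1^2 + n2*sd2^2) / (n1 + n2 - 2).
    """
    df = n1 + n2 - 2
    pooled_var = (n1 * sd1**2 + n2 * sd2**2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


def point_biserial_from_t(t, df):
    """Point-biserial r implied by a two-group t ratio."""
    return t / math.sqrt(t**2 + df)


# Psychopathic deviate scale: persistent males (N = 47) entered first,
# nonpersistent males (N = 28) second. With persistence as the "1" category,
# a higher nonpersistent mean gives a negative t and a negative r.
t, df = pooled_t(52.17, 8.55, 47, 57.75, 10.05, 28)
r_pb = point_biserial_from_t(t, df)
print(f"t = {t:.2f}, df = {df}, r_pb = {r_pb:.2f}")
# prints: t = -2.52, df = 73, r_pb = -0.28
```

Under these assumptions the result agrees, apart from sign convention and rounding, with the t ratio of 2.5247 and the coefficient of -.282 reported for this scale.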


Results 


Differences between the persistent and 
nonpersistent females. Table 1 shows the 
mean standard scores and standard deviations 
on nine diagnostic scales of the MMPI for 
the persistent and nonpersistent female groups.¹

Table 1 also shows, in terms of t ratios,
the significance of the mean differences between
the persistent and nonpersistent female
groups with respect to the nine diagnostic
scales of the MMPI. At the 1% level of confidence,
these two groups differed significantly
with respect to hypomania and psychopathic 
deviation. Although the nonpersistent fe- 
males show greater tendency toward hypo- 
mania and psychopathic deviation, an inspec- 
tion of mean scores reveals that on both 
these scales they fall within the “normal” 
range. However, a further analysis of the 
data revealed that, as contrasted with the 
women in the persistent group, a significantly 
larger percentage of women in the nonpersistent
group had scores of 70 (two SD’s
above the mean) or higher.

At the 5% level of confidence, the persistent
and nonpersistent female groups differed
with respect to hypochondriasis, psychasthenia,
and schizophrenia. The mean scores
for both groups on these three scales fell
within the “normal” range, but in each case
the higher mean scores were obtained by the
nonpersistent females and a greater percentage
of them, as contrasted with the persistent
females, scored at 70 or higher on these three
scales.

1 In the tables which follow, the letters PF refer
to the persistent females; NPF, nonpersistent females;
PM, persistent males; and NPM, nonpersistent
males.

Table 1
Mean Standard Scores and Their Standard Deviations of Nine Diagnostic Scales of the MMPI
for the Persistent and Nonpersistent Female Groups
(PF: N = 50; NPF: N = 49)

                               PF             NPF
MMPI Scale                Mean     SD        Mean        t
Hypochondriasis           48.62    5.30      52.06     2.4927*
Depression                47.12    5.73      49.65     1.7692
Hysteria                  54.48    5.94      56.84     1.5628
Psychopathic deviate      53.06    8.07      57.88     2.7079**
Masculinity-femininity    47.94    7.47      49.10      .6554
Paranoia                  52.10    7.34      54.31     1.3393
Psychasthenia             51.98    5.60      55.31     2.1210*
Schizophrenia             52.48    7.16      56.63     2.5305*
Hypomania                 55.84    9.33      61.29     2.7665**

* Significant at less than the 5% level.
** Significant at less than the 1% level.

Differences between the persistent and nonpersistent
males. Table 2 shows the mean
standard scores and SD’s on nine diagnostic
scales of the MMPI for the persistent and
nonpersistent male groups as well as the significance
of the mean differences in terms of
t ratios. It can be seen that at the 5% level
of confidence these two groups differed significantly
with respect to one scale, psychopathic
deviate. An inspection of the mean
scores on this scale for these groups reveals
that they fall within the “normal” range.
However, the higher mean score was obtained
by the nonpersistent group and a greater percentage
of the students in this group, as contrasted
with students in the persistent group,
scored at or above 70. Although the mean
differences on the remaining scales were not
significant, the higher mean scores were obtained
by the nonpersistent male group, except
for the paranoia scale where the higher
mean score was obtained by the persistent
male group.

Table 2
Mean Standard Scores and Their Standard Deviations of Nine Diagnostic Scales of the MMPI
for the Persistent and Nonpersistent Male Groups
(PM: N = 47; NPM: N = 28)

                               PM               NPM
MMPI Scale                Mean     SD      Mean     SD         t
Hypochondriasis           51.68    7.01    53.00    13.58    [illegible]
Depression                51.17    8.67    55.11    11.89    1.6349
Hysteria                  56.11    5.97    57.61    10.65     .7692
Psychopathic deviate      52.17    8.55    57.75    10.05    2.5247*
Masculinity-femininity    63.45    9.06    65.25    12.09     .7200
Paranoia                  54.40    6.67    52.93     6.57     .9131
Psychasthenia             54.02    7.97    57.61    11.70    1.5541
Schizophrenia             54.19    8.82    57.36     9.94    1.4152
Hypomania                 58.70    8.37    60.96    11.09     .9912

* Significant at less than the 5% level.

The relationship between personality traits 
and persistence of interest in teaching as a 
vocational choice. Table 3 shows that for 
the female groups five correlation coefficients 
were high enough to be significant at either 
the 1% or 5% level of confidence. At the 
1% level, there was a significant but nega- 
tive relationship between psychopathic devia- 
tion and persistence and between hypomania 
and persistence. In other words, high scores 
on these scales indicate lack of persistence of 
interest in teaching as a vocational choice 
among females. At the 5% level of confi- 
dence, the following traits were negatively re- 
lated to the criterion: hypochondriasis, psy- 
chasthenia, and schizophrenia. 

Table 4 shows that only one correlation coefficient
for the male groups was high enough
to be significantly related to the criterion.
There was a significant but negative relationship
(5% level of confidence) between psychopathic
deviation and persistence of interest
in teaching as a vocational choice among
males. The remainder of the correlation coefficients
were also negative, but not high
enough to be significant.

Table 3
The Relationship Between Personality Traits as Measured
by the MMPI and Persistence of Interest
in Teaching as a Vocational Choice
Among Females
(N = 99)

                          Point Biserial
Personality Trait         Coefficient
Hypochondriasis           -.243*
Depression                -.174
Hysteria                  -.156
Psychopathic deviate      -.263**
Masculinity-femininity    -.066
Paranoia                  -.132
Psychasthenia             -.210*
Schizophrenia             -.248*
Hypomania                 -.270**

* Significant at less than the 5% level.
** Significant at less than the 1% level.

Table 4
The Relationship Between Personality Traits as Measured
by the MMPI and Persistence of Interest
in Teaching as a Vocational Choice
Among Males
(N = 75)

                          Point Biserial
Personality Trait         Coefficient
Hypochondriasis           -.064
Depression                -.187
Hysteria                  -.090
Psychopathic deviate      -.282*
Masculinity-femininity    -.084
Paranoia                  -.160
Psychasthenia             -.166
Schizophrenia             -.163
Hypomania                 -.115

* Significant at less than the 5% level.


Summary 


The findings described above seem to sup- 
port the following generalizations and con- 
clusions: 


1. Significant differences exist between fe- 
male students who persist and those who do 
not persist in their interest in teaching as a 
vocational choice with respect to personality
traits as measured by the MMPI. As contrasted
with the persistent females, the nonpersistent
females have a significantly greater
tendency toward psychopathic deviation, psy- 
chasthenia, schizophrenia, hypomania, and 
hypochondriasis. 

2. A number of personality traits of the fe- 
male students are significantly but negatively 
related to the criterion, persistence of inter- 
est in teaching as a vocational choice. High 
scores on the hypochondriasis, psychopathic 
deviate, psychasthenia, schizophrenia, and 
hypomania scales indicate lack of persistence 
of interest in teaching as a vocational choice. 

3. Although a number of personality traits 
of the female students as measured by the 
MMPI are significantly related to the criterion,
there is no one trait, taken singly, that
could be used to predict persistence of inter- 
est in teaching as a vocational choice. The 
correlation coefficients are too small to be 
used for prediction purposes. 

4. With respect to the male groups, the 
MMPI does not, to any great extent, differ- 
entiate between those men who persist and 
those who do not persist in their interest in 
teaching as a vocational choice. On only one 
trait, psychopathic deviate, was there a sig- 
nificant difference between the two groups. 
This trait was also significantly but negatively 
related to the criterion. 

In conclusion, it seems evident that the 
real value of the MMPI is in clinical rather 
than in vocational diagnosis. There is need 
for confirming some of its possible vocational
implications. More studies of stability in an
occupation, job satisfaction, completion of
training, or similar criteria by means of personality
inventories are needed to throw more
light on the dynamics of vocational choice 
and adjustment. 
In the original standardization of the Barron-Welsh
Figure-Preference Art Scale (1),
the criterion for item selection was that artists 
should prefer a given item significantly more 
frequently than it was preferred by non- 
artists. The cross validation of the scale 
consisted of a comparison of groups defined 
in exactly the same fashion, with the evi- 
dence for validity being that artists obtained 
a significantly higher score on the scale than 
did nonartists. 

An important question to which Barron and 
Welsh did not address themselves is this: In 
a sample of artists, do scores on the scale cor- 
relate significantly with the merit of the art 
work the artists produce? This would seem 
to be crucial if the scale is to be used for 
purposes of selection or vocational guidance. 
Furthermore, one should reasonably expect 
that artistic discrimination would increase 
with training in art, so that in art classes the 
scores on the scale should be related to how 
advanced the student is in his studies. 

The present investigation was designed to 
provide an answer to such questions. Three 
kinds of subjects were used: art students, 
artists of recognized standing, and nonartists. 
The art students, 44 in number, were majors 
in Art at Tulane University, all of them be- 
ing enrolled in the Newcomb Art School. 
The artist group consisted of eight faculty 
members of the same school; the nonartists 
were faculty members in other departments 
of the university, and were matched with the 
artist group in terms of age and sex. 

Of the 44 art students, 22 were in the first 
year of their work in art at the school, while 
the other 22 were advanced students. For all 
students, ratings of the originality of their
work were obtained from the art faculty. In
addition, the academic grades of the students
were available. Approximately half of these
grades were earned in applied “studio” art
courses.

1 This study was conducted while the author was
attending Tulane University.

2 The author wishes to express his appreciation to
Dr. Frank Barron, the Department of Psychology of
Tulane University, and to the Newcomb Art School.

The Barron-Welsh Art Scale was adminis- 
tered individually to each subject. The ad- 
vanced student group earned a mean score of 
40.7 on the scale, while the beginning students 
earned a mean of 39.0. With 42 df, this mean 
difference gives a t of 0.14, not statistically
significant. The mean score of the estab- 
lished artists was 41.1, which is not signifi- 
cantly different from the averages of the two 
student groups. The nonartists, however, 
earned a mean score of 22.1, 19 points less 
than the score of the artists. The difference 
between the artist and nonartist groups 
yielded a t of 3.04, and is therefore statistically
significant at better than the .01 level
of confidence. 

These data agree closely with the observa- 
tions of Barron and Welsh. In their cross 
validation of the scale, they found the mean 
score for artists to be 39.1, and for nonartists 
to be 18.4. 

It is evident that scores on the Art Scale 
do not increase as a function of level of 
training in art, but that persons who are not 
artists at all make significantly lower scores 
than do persons who have sufficient interest 
in art to undertake formal instruction in it. 


Relation of the Scale to Originality 
and to Grades 


Barron (2) had reported correlations rang- 
ing from .30 to .45 between the Art Scale and 
ratings of originality in various samples of 
medical students and of Ph.D. candidates in 
the sciences. No data have been reported, 
however, relating Art Scale scores to origi- 
nality among art students themselves. It 
might be expected on a priori grounds that
variation in originality would be more re- 
stricted among artists than among nonartists, 
and of course the same would hold true for 
scores on the Art Scale. Hence not very high 
correlations would be expected within a sam- 
ple of artists. Nevertheless, a significant de- 
gree of correlation should exist if the scale 
has any validity for selection purposes. 

The ratings of originality in the present 
study were obtained by having each faculty 
member at the Newcomb School rate a single 
art product of each of the students on a 5- 
point scale. An average rating for each stu- 
dent was thus obtained, and these ratings 
were then correlated with the Art Scale scores 
of the students. The Pearson correlation coefficient
proved to be .40, which with an N
of 44 is significant at the .02 level.

The correlation of the Art Scale with grade- 
point average of the art students was some- 
what lower than this. The Pearson r proved 
to be .34, which is significant at the .05 level. 
Both correlations are within the range of r’s 
previously found in Barron’s investigations, 
and may be taken to be representative of the
degree of validity of the scale for predicting
originality and artistic ability. 
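
The significance statements for the two correlations can be checked with the usual conversion of a Pearson r to a t statistic on N - 2 degrees of freedom. The following is a minimal sketch only; the function name is hypothetical, and the critical value is an approximate figure taken from standard t tables rather than from the article.

```python
import math


def t_from_r(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r**2)


N_STUDENTS = 44
# Assumption: approximate two-tailed 5% critical value of t for 42 df,
# taken from standard tables (not given in the article).
T_CRIT_05 = 2.02

for label, r in [("originality ratings", 0.40), ("grade-point average", 0.34)]:
    t = t_from_r(r, N_STUDENTS)
    print(f"{label}: r = {r:.2f}, t = {t:.2f}, beyond 5% value: {t > T_CRIT_05}")
```

Both coefficients give t values above the assumed 5% critical value, in line with the levels reported in the text.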


Summary 


The Barron-Welsh Figure-Preference Art 
Scale was administered to groups of art stu- 
dents, established artists, and nonartists. 
Both course grades and ratings of originality 
of actual art products were available for the 
art students, and were correlated with scores 
on the Art Scale. The scale scores proved to 
have a significant degree of relationship to 
both rated originality and grades in the stu- 
dent group. Scores did not increase as a 
function of level of training, however, begin- 
ning students scoring the same as advanced 
students, and established faculty artists being 
not significantly different from their students. 
Both artists and art students, however, earned 
much higher scores than did nonartists, a 
finding closely in accord with that reported 
by the authors of the scale. 
In a previous article in this Journal (1)
the authors presented the results from the 
first of a series of research evaluations of 
psychological warfare (PW). The purpose 
of these researches was primarily directed 
toward analyzing hypotheses about the funda- 
mental nature of psychological warfare; spe- 
cifically, to determine certain of the anteced- 
ent and attendant psychological factors that 
influence the effectiveness of tactical psycho- 
logical warfare. The first of these studies 
analyzed the effectiveness of psychological 
warfare as employed in Korea. The present 
study is a replication of the first investigation 
with another PW target group, the Communist
Terrorists (CT) of Malaya. As in the
previous study, the effects of psychological
warfare were conceived in terms of psychological
factors predictable from a knowledge
of the attitudes, motives, and experiences of 
the recipients. For example, each individual 
in a group has certain attitudes toward the 
actions of his leaders, he feels himself more 
or less a part of his group, he has attitudes 
toward the aims and methods of the struggle 
of which he is a part, he identifies himself to 
varying degrees with communism, and so on. 
In addition, his behavior may be said to be 
determined by a number of situational fac- 
tors, such as the physical environment in 
which he is obliged to operate, his participa- 
tion in “military” operations, and the like. 


1 This report was extracted from a more detailed 
and complete report of the total research project. 
The material has been approved for publication by
the British War Office, the United States Depart- 
ment of the Army, and the Operations Research 
Office, The Johns Hopkins University. The views 
herein expressed are those of the authors and do not 
necessarily reflect the opinions of the British Army, 
United States Army, or the Operations Research Of- 
fice. Official clearance of the material in this report 
has precluded presentation of several features of the 
investigation. 


Clearly, the effectiveness of psychological 
warfare in precipitating disaffection and/or 
surrender is a function of many factors that 
shape CT behavior. The extent to which 
psychological warfare can affect a target
group is determined by the nature and ex- 
tent of existing behavior patterns. 

The basic hypothesis of this study was that 
the psychological effects of propaganda of the 
general type employed by the Security Forces 
in Malaya could be predicted, to some extent, 
in terms of the attitudes, motives, and experi- 
ences of the target audience, the CT’s. In 
other words, it would be possible to isolate 
certain factors that had predisposed individu- 
als and groups differentially to increased re- 
ceptiveness for such propaganda, and that 
understanding these factors would enable bet- 
ter prediction and control of behavior of mili- 
tary importance. 


Criteria and Factors Studied 


Adequate criteria for measuring the effects 
of psychological warfare and other factors 
are necessary for the success of operational 
investigations of this order. Two criteria 
seemed appropriate to the physical and mili- 
tary situation in Malaya: (a) degree of dis- 
affection on the part of CT’s prior to their 
being taken prisoner, and (b) surrender-capture
status. In addition, several other factors
were included on the basis of their importance
to the primary problem of the study.
In all, nine factors were studied. 

A. Degree to which the individual was in 
accord with the aims and ideology of the 
MCP (Malayan Communist Party) before 
entry into the jungle. 

B. Degree to which and frequency with 
which the individual experienced fear while 
fighting in the jungle. 
C. Degree to which the individual felt that
he was poorly treated by his own forces, e.g., 
reactions to leadership, job assignment, etc. 

D. Degree to which the individual felt the 
effects of jungle living, e.g., physical discom- 
forts, adequacy of supplies and food, natural 
dangers, etc. 

E. Degree to which the individual reacted 
to MCP propaganda. 

F. The amount and intensity of battle ex- 
perience while in the jungle. 

G. The amount and impact of British Se- 
curity Forces information of any kind re- 
ceived by the individual while in the jungle. 

H. Degree of disaffection or change in the 
extent to which the individual was in accord 
with the aims and actions of the MCP (Cri- 
terion). 

I. Surrender-capture status. (Criterion.
See Footnote 3.)


Sources of the Data²

Samples


The samples of CT’s used in this investiga- 
tion consisted of all SEP (Surrendered Enemy 
Personnel) and CEP (Captured Enemy Per- 
sonnel) available during the period of 1 July 
1953 to 15 February 1954. The total num- 
ber of prisoners sampled was 145, made up 
of 127 SEP and 18 CEP. 

Because of the nature of this study as well 
as conditions peculiar to SEP and CEP proc- 
essing, it was not possible to employ certain 
methods of ensuring representativeness of the 
samples for either the total SEP-CEP popu- 
lations or the ongoing CT population in the 
jungle. However, the belief that the samples 
used here do furnish a basis for the formula- 
tion of important conclusions is warranted by 
the following conditions: The two samples 
each constitute a relatively large part of the 
total SEP-CEP group as of the time of the 
investigation. Also each of the two samples 
included CT’s of different ages, length of time 
in the jungle, rank, education, occupation, 
party membership, sex, etc. These facts 
would seem to indicate that the samples were 
representative of a wide variety of characteristics
that would be expected to obtain in
the total SEP-CEP populations, as well as in
the ongoing CT population in the jungle.

2 See Footnote 1.

No claim is made that the conclusions 
reached here should be generalized for groups 
that were not represented in the research, 
e.g., civilian sympathizers, etc. 


The Questionnaire 


For each of the first eight factors (A-H),
a series of questions was developed and mul- 
tiple-choice responses were formulated. The 
possible responses were designed in such man- 
ner as to be scalar in terms of the intensity 
of the experience or attitude being measured. 
Some examples are presented below: 


A: When you joined the Communists, to what extent
did you believe that there could be no other
political ideology than communism that could give
happiness to the worker?

___ (1) I did not believe this at all.
___ (2) I did not believe one way or the other.
___ (3) I accepted this as being reasonably true.
___ (4) This influenced me greatly to support the
        MCP program.

G: While you were in the jungle, to what extent did
you believe the Government messages about good
treatment given to Communist fighting men who
gave up?

___ (4) I thought this a complete lie.
___ (3) I had considerable doubts about the truth
        of these messages.
___ (2) I thought more and more that perhaps the
        messages were true.
___ (1) These messages made me decide to get out
        of the fight.


As mentioned, the items selected to define 
Factors A-H were scalar in terms of intensity. 
Thus, if an individual answered each item of,
say, Factor A with a choice designated “4” 
(maximum intensity), it could be assumed 
that he was more in accord with the war aims 
and ideology of the MCP than an individual 
who responded with all choices number “1” 
or any other patterns of choices among the 
six items of Factor A. On the basis of items 
of this type, a total could be arrived at for 
each factor of each individual’s inventory by 
merely adding together the numbers identi- 
fied with the particular answers he had given. 
This total was taken as the estimate of the 
individual’s status on the scale or dimension 
of behavior, experience, or attitude. 
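
The additive scoring just described can be sketched in a few lines. The function name and the response patterns below are hypothetical illustrations, not data from the study; the sketch assumes only what the text states, namely that each item offers choices numbered 1 to 4 and that Factor A has six items.

```python
def factor_score(responses, n_choices=4):
    """Sum the choice numbers (1..n_choices) given on one factor's items."""
    if any(not 1 <= r <= n_choices for r in responses):
        raise ValueError("each response must be a choice number from 1 to n_choices")
    return sum(responses)


# Hypothetical answers to the six items of Factor A.
all_fours = [4, 4, 4, 4, 4, 4]   # maximum intensity on every item
mixed = [2, 3, 1, 4, 2, 3]       # a mixed pattern of choices

print(factor_score(all_fours))  # 24
print(factor_score(mixed))      # 15
```

With six four-choice items the possible totals run from 6 to 24, so a respondent choosing "4" throughout necessarily scores above any other pattern.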

It will be recalled that the two criteria used 
for prediction consisted of (a) disaffection,
and (b) surrender-capture status. A measure
of the first criterion, disaffection (Factor
H), was obtained by the scale procedure de- 
scribed above. Surrender-capture differentia- 
tion—in the past, a very difficult problem— 
was effected by accepting the status accorded 
the individual by the British authorities, mili- 
tary, police, or legal. Since the military defi- 
nition of surrender-capture by these authori- 
ties was quite rigorous and corresponded 
quite well with previous research definitions, 
it was accepted without modification. 


Data Collection 


Two thoroughly experienced Chinese-Eng- 
lish speaking interrogators were used in in- 
terviewing the samples of SEP and CEP. 
The two interrogators underwent a training 
period in order to familiarize them with the 
peculiarities of interrogation for the present 
research problem and to pretest the question- 
naire in terms of content, language, and idiom. 

The average time interval between capture 
or surrender and interrogation was three 
weeks. It is important to note that this in- 
terval is much briefer than is characteristic 
of most PW research interviewing in general. 
There is always a concern about the effects 
of the usually long delays in obtaining infor- 
mation such as was sought in the present 
study. A consideration of the administrative 
procedures involved in the handling of SEP-CEP
indicates that the obtained interview intervals
were at a minimum. For all prisoners,
special efforts were made to provide adequate
physical facilities for the interviewing,
to help the prisoner to be comfortable during
his interview and to preclude any factors that
would tend to bias the obtaining of data.
Each of the interviews took from one to two
hours.

Fig. 1. Average scale scores expressed as deviations from neutrality.

3 Surrender status was given to any CT who voluntarily
turned himself over to the military or police
authorities; capture status was accorded to any
CT who had to be taken by force of arms.

L. A. Kahn and T. G. Andrews


Results 
Comparisons of Scale Responses 


The differences in scale scores obtained for 
the two samples, CEP and SEP, are summa- 
rized in Fig. 1. Presented are the average 
scores for the eight scales expressed as devia- 
tions from the mid-point of each scale’s range 
of scores, that is, as standard scores. The 
mid-point in every case has been made to 
represent the point of neutrality, that is, any 
obtained score that matches the mid-point 
score is to be interpreted as meaning that the 
group is on the average neutral with regard 
to whatever the scale is measuring. 
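
The deviation-from-neutrality idea can be illustrated as follows. This is a sketch under stated assumptions: the mid-point is taken as the centre of the possible total-score range for a scale, the group totals are invented for illustration, and the result is left as a raw deviation because the text does not describe any further scaling. All names are hypothetical.

```python
from statistics import mean


def scale_midpoint(n_items, n_choices=4):
    """Centre of the possible total-score range (the assumed neutral point)."""
    lowest = n_items * 1
    highest = n_items * n_choices
    return (lowest + highest) / 2


def mean_deviation_from_neutrality(total_scores, n_items, n_choices=4):
    """Group mean total score minus the scale mid-point."""
    return mean(total_scores) - scale_midpoint(n_items, n_choices)


# Invented totals on a six-item scale for two small groups.
group_one = [20, 18, 22, 19]
group_two = [14, 16, 15, 13]

print(mean_deviation_from_neutrality(group_one, n_items=6))  # 4.75
print(mean_deviation_from_neutrality(group_two, n_items=6))  # -0.5
```

A value of zero corresponds to a group that is, on the average, neutral on the scale; the sign shows on which side of the mid-point the group mean falls.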

From Fig. 1 it will be observed that CEP 
mean scores are generally below those of the 
SEP. In terms of direction of responses, i.e., 
position with respect to neutrality, in many 
respects there is some overlap between the 
two samples. For example, SEP and CEP 
average scores deviate to about the same de- 
gree from neutrality on scales A, B, C, and 
E. That is, both samples before entry into
the jungle endorsed MCP aims and ideology, 
while in the jungle were not affected by a 
fear syndrome, and were uncertain about the 
extent to which MCP propaganda affected 
them. On the question of adequate treatment 
received while in the jungle (C), both sam- 
ples do not differ significantly. Significant 
differences between CEP and SEP were found 
for Scales D, F, G, and H. On Scale D, for 
example, CEP were rather neutral about the 
physical conditions under which they lived 
while in the jungle. The SEP, in contrast, 
considered life in the jungle intolerable. 

Again, with regard to effects of British 
propaganda, as one might expect, the SEP 
are seen to “be on the fence” if our inter- 
pretation of neutrality in this case is correct; 
the CEP, on the other hand, are seen to ne- 
gate or disbelieve such information and propa- 
ganda. In a similar fashion, interpretation 
of differences between SEP and CEP on Fac- 
tors F and H indicated that CEP, in sharp 
contrast to SEP, did not recognize the effects
of their activities against the Security Forces; 
and SEP leaned toward disaffection as a mem- 
ber of the CT, whereas CEP indicated little 
or no signs of disaffection. 

From the data presented in Fig. 1, it is also 
possible to make comparisons among the 
scales. A first fact to be noted is that the 
pattern of results for the two samples is es- 
sentially the same, differences between the 
two samples being expressed in terms of mag- 
nitude of scores that comprise the respective 
patterns of response. For the SEP, physical
factors (D) are revealed as having greatest 
strength as a source of disaffection with the 
MCP. As for the CEP, Factor E shows up 
as strongest in terms of revealing disaffection 
with MCP. Since an effort was made to have 
each of the scales composed of questions that 
were somewhat representative of the area to 
which the scales pertained, there are grounds 
for believing that disaffection (now used in 
the sense that these factors underlie any de- 
cision to surrender) is strongest in the areas 
described by Scale D for SEP and Scale E for 
CEP. In terms of those areas studied, where 
the answers revealed least disaffection, Factor 
A (MCP aims and ideology) applies to both 
SEP and CEP. To an even greater extent 
for CEP, Factor F (combat experience) con- 
tributed little. 

At this point, a word of caution should be 
inserted with respect to the interpretation of 
differences just pointed out. Statements made 
to the effect that “disaffection” was strongest 
or weakest in any given area for any given 
group should not be taken to mean that they 
are the most important in terms of determin- 
ing surrender-capture behavior or disaffection 
as defined by our criterion scale (H). It can 
easily be demonstrated that although a given 
target audience may be highly “disaffected” 
with respect to a given factor, it does not 
necessarily follow that it is this factor which 
is most responsible for surrender behavior. 
The problem of prediction of surrender-cap- 
ture behavior and disaffection (H) is dealt 
with in the following section. 


Analysis of Correlations Among Factors 


The analyses reported in the previous sec- 
tion, although describing the characteristics 
of the scale response distributions for the two
samples, do not deal directly with the most 
important problem of the present study, 
namely, the interrelations among the various 
factors under investigation and their influ- 
ence upon both surrender-capture and disaf- 
fection behavior, with special attention to the 
role of psychological warfare. For this pur- 
pose the statistical techniques of correlation 
were used. The intercorrelations among the 
nine factors are presented in Table 1. 

Although no a priori cause-and-effect as- 
sumption is made regarding the interpreta- 
tion of these correlations in general, this is 
not regarded as true here with respect to the 
correlations between Factors A to G and Factors
H and I, now considered as the criteria to
be predicted. Furthermore, there is the logical
assumption (on which the primary hypothesis
of the study is based) that Factors
A-G are, in effect, contributors to or predictors
of surrender-capture (I) and/or disaffection
(H) behavior. For example, in regard
to the obtained correlation between
psychological warfare (G) and surrender-capture
status (I), previous research evidence
strongly suggests that psychological warfare
contributes to and directly affects surrender-capture
behavior.

The correlation coefficients in rows H and 
I of Table 1 indicate the extent to which dis- 
affection and surrender behavior, respectively, 
relate to the other factors studied in this in- 
vestigation. The factors that correlate sig- 
nificantly with one of these two criteria cor- 
relate in the same direction with the other, 
and the two criterion scales themselves cor- 
relate significantly with one another. This 
expected consistency of results for the two 
criterion scales is an important index of the 
extent to which any conclusions to be drawn 
from each of the two scales can be considered 
reliable. 

Those who before entry into the jungle 
claimed not to endorse MCP aims (A) tended 
to maintain this same nonendorsement of 
MCP propaganda (E) while in the jungle.
There is a very significant positive
relationship between amount of combat (F)
and disaffection (H), and between combat (F)
and surrender (I). That is, those who claim
to have experienced much action against the
Security Forces also maintained that they
became disaffected and surrendered willingly.


Table 1


Obtained Correlations Among Specified Attitudes and Experiences of Malayan CT’s
(N = 145)

A. Nonendorsement of MCP aims
B. Experience of fear
C. Inadequate treatment
D. Intolerable physical factors
E. Nonendorsement of MCP information
F. Amount of combat
G. Affected by SF information
H. Disaffection
I. Surrender

The relationship between endorsement of 
MCP information and being affected by Se- 
curity Forces information is just short of sig- 
nificance at the .01 level set for this study. 
The direction and extent of this relationship,
however, serve as a consistency check on our
data and analysis. That is, in general, it
would be somewhat improbable to find
endorsement of MCP information (E) together
with the claim to have been affected by
propaganda in the direction of disaffection
and surrender. The obtained correlation
between E and G is at least of sufficient
magnitude to be interpreted as evidence of
such internal consistency. It should also be
pointed out that the simple relationship found
between E and G does not carry through to
the assertion that rejection of E necessarily
leads to disaffection and/or surrender. As a
matter of fact, the correlations between E and
H and between E and I were found not to be
significant. The correlation between amount of
combat experience (F) and endorsement of
MCP information and propaganda (E) was
found to be significant. That is, those who
claimed not to have endorsed MCP propa- 
ganda also reported little or no combat ex- 
perience. It is quite probable that most of 
the active fighting is carried out by “hard- 
core” troops, and this is in line with our 
thinking about the structure of the CT forces. 
The experience of fear (B) is seen to be 
highly related to the intolerableness of physi- 
cal factors affecting living in the jungle as a 
CT (D). On the other hand, reported
experience of fear (B) was not significantly
related to the amount of combat (F)
experienced. Likewise, no significant
relationship was obtained between the
experience of fear (B) and disaffection (H)
or surrender (I). As for intolerable physical
conditions (D), significant positive
correlations were established with both
enemy disaffection (H) and surrender (I).


It is interesting to note that there is a very 
significant relationship between the inade- 
quacy of treatment received by their own 
forces (C) and nonendorsement of propaganda
from their forces (E). That is, it
would appear that MCP information and
propaganda could not overcome what was
believed to be inadequate treatment. More-
over, the relation between inadequacy of
treatment (C) and disaffection behavior (H)
was very significant, whereas the relation be-
tween inadequacy of treatment (C) and sur-
render status (I) was not. This leads to the
conjecture that inadequacy of treatment (C) 
was only of such severity as to lead to grum- 
bling or a lower level of disaffection, but not 
enough to precipitate surrender, which is of 
course a more severe form of military be- 
havior. 

Of prime interest in this investigation is 
the role of psychological warfare (G) and its 
effect on disaffection (H) and surrender
status (I). From the results of the correla-
tional analysis, it was observed that there is
a very high correlation between British propa-
ganda (G) and surrender status (I), and a
substantial correlation also with disaffection
(H). In terms of the basic hypothesis of
this study these results, at this stage of the 
analysis, should be noted as quite important. 
A further observation has to do with the sig- 
nificant correlation found between PW propa- 
ganda (G) and inadequate treatment (C), a 
result which would tend to confirm the no- 
tion that receptivity to enemy propaganda 
may be conditioned by the preparatory ef- 
fects of other experiential factors. 

We have briefly discussed some of the more 
important simple relationships that were found 
among the factors studied. Since the major 
task of this investigation was to determine 
the effect of several factors acting jointly to 
influence disaffection and surrender behavior, 
the results of the foregoing
analysis should not be overemphasized. A simple
correlation does not necessarily indicate the 
unique relation between two variables when 
there is evidence that such variables also cor- 
relate well with one or more other factors, as 
in the present case. The statistical technique 
used to determine the correlation of two fac- 
tors in independence of the influence of an- 
other single factor is, of course, partial cor- 
relation. Reference to Table 1 shows that it 
would be quite possible to establish a large 
number of partial correlation coefficients, 
each of which would define the new order of 
relationship between the factors concerned 
when the mutual influence of a third factor 
is removed. It is not certain, however, 
whether such a protracted analysis would get 
at the fundamental problem of the investiga- 
tion: “What is the unique contribution of 
each of a combination of factors acting jointly 
to affect disaffection and/or surrender behav- 
ior?” The method of analysis used to derive 
answers to this problem was multiple regres- 
sion. 
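
The first-order partial correlation described
above can be sketched in a few lines. This is
the standard textbook formula, not a
computation reported by the authors; the
function name and the three input values are
illustrative assumptions and are not
coefficients from Table 1.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denom == 0.0:
        raise ValueError("z is perfectly correlated with x or y")
    return (r_xy - r_xz * r_yz) / denom


# Hypothetical correlations, chosen only to show the arithmetic:
# G with I, G with C, and C with I.
r_gi, r_gc, r_ci = 0.50, 0.30, 0.20
r_gi_given_c = partial_correlation(r_gi, r_gc, r_ci)
print(round(r_gi_given_c, 3))
```

With nine factors there are many such triples,
which is the practical reason the text gives
for moving on to multiple regression.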

It will be recalled that two criterion factors
were delineated for prediction: disaffection
behavior (H) and surrender-capture status
(I). All factors were combined and corre-
lated first against the criterion factor of sur-
render status (I); the multiple-correlation
coefficient was found to be .695. Consid-
ering the types of factors used to predict
surrender status (I), the obtained multiple-
correlation coefficient is very high and statisti-
cally significant. It should also be noted that
this coefficient contains the contribution of
the other of the two criteria (H), and to that
extent it is somewhat inflated.

If on the basis of re-examination of Table
1 we combine into one system all those fac-
tors that bear a significant relation⁴ to the
criterion factor of surrender status (I), the
multiple correlation is reduced to .649, which
is not significantly different from the origi-
nal .695. This result points up the fact that
knowledge about MCP aims (A), fears (B),
and MCP information and propaganda (E),
the factors excluded, does not contribute ma-
terially to the prediction of surrender-capture.

Using disaffection (H) as the criterion and
the full combination of factors (excepting
surrender), the multiple-correlation coeffi-
cient is found to be .576, which is highly sig-
nificant. If, as before, we combine into one
system only those factors (C, D, F, G) that
bear a significant relation to the criterion of
disaffection (H), the multiple-correlation co-
efficient is reduced to .520, which is not
markedly below the original .576.
Again, we are prepared to conclude that the
prediction of disaffection is equally efficient
when only four measurable factors
(C, D, F, G) are used as when the complete
battery of factors is used.

The extent to which each of the factors 
acts as a unique determiner of disaffection 
and/or surrender behavior is given by the 
beta weights in a given multiple regression 
equation. Considering first the relative in-
dependent contribution of factors C, D, F, G,
and H to the prediction of surrender-capture
(I), we obtain:


I’ = -.114C’ + .258D’ + .240F’ + .393G’ + .156H’.


From the above equation, it should be noted
that British propaganda (G) carries the larg-
est relative weight, which is statistically very
significant. With respect to the basic hy-
pothesis of this investigation, this result per-
mits the assertion that psychological warfare
does serve independently to influence sur-
render behavior. Of lesser relative
importance, but still statistically significant,
is disaffection (H) as a factor affecting
surrender. Inadequate treatment (C) is shown
to be of no importance in this connection
because it is not statistically significant.


4 The level of significance set was P ≤ .01, or a
correlation of .21 or higher.
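
Footnote 4 equates the .01 level with a
correlation of about .21 for the sample of 145.
A rough check is sketched below. It assumes a
two-tailed test and uses the Fisher z
approximation, which is not necessarily the
table or method the authors consulted; the
function name is an illustrative assumption.

```python
import math
from statistics import NormalDist


def critical_r(n, alpha=0.01):
    """Approximate two-tailed critical correlation via Fisher's z transform.

    Assumption: z = atanh(r) is roughly normal with standard error
    1 / sqrt(n - 3), so the critical r is tanh(z_crit / sqrt(n - 3)).
    """
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return math.tanh(z_crit / math.sqrt(n - 3))


# N = 145 is the sample size given in the caption of Table 1.
print(round(critical_r(145), 2))
```

Under these assumptions the threshold comes out
close to the footnote's figure, which suggests
the stated level refers to a two-tailed test.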

Turning now to a consideration of the 
second criterion used, disaffection (H), it 
should be recalled that the combination of 
factors C, D, F, and G yielded a multiple-
correlation coefficient of .520. When the
regression equation is computed, the relative
independent contribution of each of the factors
is as shown in the equation below:


H’ = .239C’ + .151D’ + .156F’ + .280G’. 


From this equation, it is noted that once more 
British propaganda (G) carries the highest 
weight among the predictor factors, although 
it is slightly less than for the criterion of sur- 
render. Whereas treatment (C) in the pre- 
dictor equation for surrender was negative 
and nonsignificant, in the present context this 
factor is of almost equal importance with the 
role of British propaganda (G). Again, the 
effects of both physical factors (D) and com- 
bat experience (F) are seen to be about equal 
in relative importance in influencing disaf- 
fection. 
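
Beta weights of the kind reported in the two
equations above can be obtained directly from
a correlation matrix. The sketch below shows
the usual computation: solve the predictor
intercorrelations against the
predictor-criterion correlations, then take
the multiple R from the weighted sum. All
numbers are hypothetical, since the
coefficients of Table 1 are not reproduced
here, and the function names are illustrative
assumptions.

```python
import math


def solve_linear(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictor correlation matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(n):
            if row != col:
                factor = m[row][col] / m[col][col]
                for k in range(col, n + 1):
                    m[row][k] -= factor * m[col][k]
    return [m[i][n] / m[i][i] for i in range(n)]


def beta_weights(r_xx, r_xy):
    """Standardized regression weights and multiple R from correlations."""
    betas = solve_linear(r_xx, r_xy)
    r_squared = sum(b * r for b, r in zip(betas, r_xy))
    return betas, math.sqrt(r_squared)


# Hypothetical correlations among three predictors (r_xx) and between
# each predictor and a single criterion (r_xy).
r_xx = [[1.0, 0.3, 0.2],
        [0.3, 1.0, 0.1],
        [0.2, 0.1, 1.0]]
r_xy = [0.4, 0.3, 0.5]
betas, multiple_r = beta_weights(r_xx, r_xy)
print([round(b, 3) for b in betas], round(multiple_r, 3))
```

Because the predictors are themselves
correlated, each beta differs from the
corresponding simple correlation, which is the
point the text makes about unique
contributions.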

The significance of the above findings rests
mainly in the support they give to the basic
hypothesis of the present study. Thus it has
been demonstrated that psychological war-
fare does have an independent influence on
disaffection and on surrender behavior. Also
it is observed again that other experiential 
factors appear to act in a way to increase 
the susceptibility to psychological warfare. 
These results must also be interpreted as 
meaning that psychological warfare alone 
does not produce a sufficient increase in dis- 
affection or tendency toward surrender un- 
less the target group is already predisposed 
in terms of other factors such as those studied 
here. 

In the first reported investigation of this 
type (1) it was observed that there was need 
for replication and cross validation before 
adequate and scientifically meaningful gen- 
eralizations could be made from the results 
obtained. It is therefore an important fea- 
ture of the present study that it not only 
serves as independent replication and cor- 
roboration of the previously obtained results, 
but also that the results herein reported take
on added importance and prediction value
because of the stability shown for such results 
over a wide range of national and military 
conditions. There is nevertheless continuing 
need for carefully designed field research on 
the effectiveness of tactical and also strategic 
psychological warfare under a variety of 
types and conditions and with respect to 
other determining and attendant factors. 


Summary 


This study is a replication with another 
psychological warfare target group, the Com- 
munist Terrorists of Malaya, of a first in- 
vestigation, reported in this Journal. 

Standardized interviews on Malayan Com- 
munist Terrorist prisoners of war were car- 
ried out to test the relative importance of 
several attitudes and experiences in determin- 
ing extent of disaffection on the part of cap- 
tive troops and their willingness to surrender 
peacefully at the time they were taken pris- 
oner. Among the experiences assessed was 
the factor of psychological warfare to which 
the communist terrorists were exposed prior 
to becoming prisoners of war. ‘Scores’ on 
each factor and experience were derived, as 
well as on the two criteria of disaffection and 
willingness to surrender. 

The primary results are presented in a cor- 
relation matrix, which is analyzed for certain 
relations and with respect to the general hy- 
pothesis that psychological warfare is effec- 
tive in changing behavior, but its effects are 
mainly of a precipitating nature that is dif- 
ferential for persons more sensitized to it by 
their morale and experiences. 

The primary correlations, multiple correla- 
tions, and standard multiple regression coeffi- 
cients were analyzed and appeared to cor- 
roborate the major hypothesis. Additional 
relationships of possible military and social 
importance are deducible from the data ob-
tained.


A Model for the Analysis of Consumer Preference and an Exploratory Test


Purnell H. Benson


Surveys of consumer preference form an
extensive part of business procedure in the 
American economy. These surveys are used 
as a basis for decisions concerning produc- 
tion, pricing, packaging and advertising. The 
approach followed by management typically 
has been to employ personal judgment and 
experience in interpreting consumer studies, 
rather than to develop explicit mathematical 
lines of solution. 

In the long run, under stable conditions in 
a competitive market, the superiority of some 
commodities over others demonstrates what 
the consumer prefers to buy and what prices 
he is willing to pay. In the short run, tech- 
nological innovations and changes in purchas- 
ing power necessitate the prediction of con- 
sumer behavior in advance of manufacturing 
and marketing decisions. 

The purpose of this paper is to define and 
test a research model in terms of which con- 
sumer preference studies can provide answers 
to questions of product planning and market- 
ing. The following questions are ones upon 
which such a model focuses. 

1. What is the optimum level of general 
quality at which a commodity shall be pro- 
duced and sold for maximum consumer pref- 
erence? For example, would the consumer 
prefer to pay for a better quality toothpaste 
at a higher price, or a less expensive tooth- 
paste at a lower price? 

2. What combination of qualities in a com- 
modity is of maximum preference to the 
consumer? For example, assuming a fixed 
amount of money to be available for manu- 
facturing a toothpaste at a particular price 
level, how shall this money be apportioned 
among the various qualities produced, such as 
taste, cleansing efficiency, and foaminess, so 
that the consumer prefers the resulting
toothpaste to one with any other combination
of qualities at the same price?

3. How much shall the manufacturer spend
upon packaging, advertising, or selling in order
to create the most demand for his product?
On what types of advertising shall he spend
his money in order to create maximum pref-
erence for the toothpaste which he sells?


1 The formulation and testing of this model were
discussed with Harold Gulliksen and Ledyard Tucker
of Princeton University, and the author acknowl-
edges his appreciation for suggestions received.


Assumptions of a Marginal Preference Model 


We next review some of the assumptions of 
a model upon which answers to these ques- 
tions may be predicated. 

1. The preference for one commodity com- 
pared qualitatively with another can be meas- 
ured in standard error units, and such meas- 
urements are additive irrespective of the way 
in which the qualitative intervals between 
commodities are divided during the measur- 
ing process. 

Techniques for measuring preferences be- 
tween stimuli in terms of a standard error of 
judgment, currently described in (3, 4), have 
been available for consumer research for two 
decades. These techniques do not constitute 
measurement in the sense of physically di- 
viding a unit of measure into a quantity a 
number of times. As pointed out by Gullik- 
sen, they assign to items from a qualitatively 
ordered series numerical values which con- 
form to the additive property of physical 
measurement (5). 

Quantitative designations are given to pref- 
erences which satisfy the condition that the 
numerical size of the interval between quali- 
ties A and C measured in terms of preference 
equals the sum of the preferences between A 
and B and between B and C, irrespective of 
what qualitative points are taken as A, B, 
and C. The determination of scale values 
and testing their consistency depend upon the 
additive character of scale values. 

Two techniques which use a standard error 
of judgment as a metric unit are the method
of paired comparisons and the method of suc-
cessive intervals (2, 3, 4, 6, 7, 8, 9, 10, 12).
These methods employ data of the same form 
already collected in consumer research, data 
which are obtained when products are pre- 
sented as stimuli in pairs for choice by the 
consumer or as stimuli to be placed in cate- 
gories such as “like,” “dislike,” or “indiffer- 
ent.” 
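
How paired-comparison frequencies become
scale values can be illustrated with a generic
Thurstone Case V computation. This is a
sketch of one common procedure from the
scaling literature, not the specific method
used later in this paper, and the proportion
matrix is hypothetical.

```python
from statistics import NormalDist


def case_v_scale(p):
    """Thurstone Case V scale values from a paired-comparison matrix.

    p[i][j] is the proportion of judges preferring item j over item i,
    so p[j][i] = 1 - p[i][j]. Each proportion is converted to a normal
    deviate, and the deviates are averaged for each item.
    """
    nd = NormalDist()
    n = len(p)
    scale = []
    for j in range(n):
        z = [nd.inv_cdf(p[i][j]) for i in range(n) if i != j]
        scale.append(sum(z) / n)  # the diagonal contributes z = 0
    lowest = min(scale)
    return [s - lowest for s in scale]  # lowest item becomes the origin


# Hypothetical proportions for three products.
proportions = [[0.50, 0.70, 0.90],
               [0.30, 0.50, 0.75],
               [0.10, 0.25, 0.50]]
scale = case_v_scale(proportions)
print([round(s, 3) for s in scale])
```

The resulting values lie on a single line, so
the interval between the first and third items
is necessarily the sum of the two adjacent
intervals; comparing it with the deviate of
the directly observed proportion is one way to
test the additive assumption.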

Such data are typically left by market 
researchers in the form of frequencies of 
persons expressing particular preferences or 
opinions. The conversion of frequencies into 
preference measurements requires a moderate 
degree of statistical labor and also attention 
to such conditions as the homogeneity of the 
group measured and the distribution of dis-
criminal or preferential errors among indi-
viduals in the group.² The reluctance of
market researchers to make the conversion to 
a more precise basis for analysis may be at- 
tributed to the need for indicating how pref- 
erence measurements can be used to solve 
management problems. 

2. The consumer seeks to spend his money 
in such a way as to maximize his preference 
for the commodities or particular qualities in 
commodities that he buys. This assumption, 
similar to that of marginal utility analysis 
(11), is formulated in terms of a measurable 
variable of consumer behavior. 

3. Consumer preference for a commodity is 
at a maximum at the points of cost outlay for 
qualities by the manufacturer at which the 
marginal preferences for qualities are equal. 

In developing this principle for changes in 
qualities of goods, we proceed along lines 
parallel to theory available in quantitative 
economics. Instead of considering quantities 
of different qualitative goods bought by the 
consumer, we assume quantities consumed to 
be invariant and consider degrees of qualities 
in the goods bought by the consumer. 

We assume a situation of constant price at
which the commodity may be sold, in which
case the total outlay by the manufacturer for
qualities in the commodity is fixed, one
quality being improvable only at the expense
of another quality.


2 The measurement of preference is to be differ-
entiated from measurement in terms of perceptual
discrimination. Ledyard Tucker defines the measur-
ing unit in the study of preference as the “prefer-
ential error” rather than the “discriminal error.”
This unit is the standard error with which the indi-
vidual expresses his preferences between qualities.

Preference for the commodity is a function 
of the amounts of money spent by the manu- 
facturer in producing its qualities. This func- 
tion is represented by 


$$Y = f(X_1, X_2, \ldots, X_i), \tag{1}$$


where $Y$ is the preference measurement for
the commodity, and $X_1, X_2, \ldots, X_i$ are
amounts of money expended in producing the
qualities.

Y is maximized under the condition that 


$$X_1 + X_2 + \cdots + X_i - K = 0, \tag{2}$$


where K is the fixed price per unit available 
for costs of production and distribution. 

The equations obtained by partial differ- 
entiation under the restriction imposed upon 
Y are 


(3) 


where 


$$\phi = X_1 + X_2 + \cdots + X_i - K, \tag{4}$$


and $\lambda$ is a Lagrange multiplier.
Solving these equations we obtain the rela- 
tionship 


$$\frac{\partial Y}{\partial X_1} = \frac{\partial Y}{\partial X_2} = \cdots = \frac{\partial Y}{\partial X_i}. \tag{5}$$

Each of these derivatives is a marginal pref-
erence, which is the ratio of an increment in
preference to the increment in cost upon
which preference changes depend.
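
The step from the constraint (2) to the equal
marginal preferences of (5) is the usual
Lagrange argument. The lines below are a
sketch of that argument, not a restoration of
the authors' own equation (3): the symbol
$L$, the running index $j$, and the sign
attached to the multiplier are assumptions
made here for illustration.

```latex
\begin{aligned}
L(X_1,\ldots,X_i,\lambda)
  &= f(X_1,\ldots,X_i) - \lambda\,(X_1 + X_2 + \cdots + X_i - K),\\[4pt]
\frac{\partial L}{\partial X_j}
  &= \frac{\partial Y}{\partial X_j} - \lambda = 0
  \qquad (j = 1, \ldots, i),\\[4pt]
\Rightarrow\quad
\frac{\partial Y}{\partial X_1}
  &= \frac{\partial Y}{\partial X_2} = \cdots
   = \frac{\partial Y}{\partial X_i} = \lambda .
\end{aligned}
```

Under this convention the multiplier is itself
the common marginal preference: every quality
is funded up to the point where one more unit
of cost buys the same increment of preference.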

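Since

$$
\begin{aligned}
\frac{\partial Y}{\partial X_1} &= \frac{\partial f(X_1, X_2, \ldots, X_i)}{\partial X_1},\\
\frac{\partial Y}{\partial X_2} &= \frac{\partial f(X_1, X_2, \ldots, X_i)}{\partial X_2},\\
&\;\;\vdots\\
\frac{\partial Y}{\partial X_i} &= \frac{\partial f(X_1, X_2, \ldots, X_i)}{\partial X_i},
\end{aligned}
\tag{6}
$$

we may solve these equations for the X values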
in terms of the equal marginal preferences. 
Letting M represent the marginal preference, 
we have the following equations which define 
the X values: 


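$$
\begin{aligned}
X_1 &= \phi_1(M),\\
X_2 &= \phi_2(M),\\
&\;\;\vdots\\
X_i &= \phi_i(M).
\end{aligned}
\tag{7}
$$

Substituting these functions for the X values
in terms of M in the constant cost equation
permits M to be determined.

$$\phi_1(M) + \phi_2(M) + \cdots + \phi_i(M) - K = 0. \tag{8}$$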


The value thereby obtained for M is utilized 
with equations (7) to find the cost points for 
the qualities which maximize consumer pref-
erence. These points provide the optimum
allocation of revenue among the various quali-
ties produced in the commodity.

The question of what levels of general 
quality and corresponding price are most pre- 
ferred by the consumer in the commodities 
bought by him arises in a situation of flexible 
prices at which commodities can be produced
and sold. Since the consumer has a pre-
scribed amount of money available for pur-
chases among various commodities, the prob-
lem of maximizing preference is formally the
same as that of determining optimum costs 
for qualities within a single commodity at a 
fixed price. Commodities are produced and 
sold at the most preferred levels of quality 
and price if the marginal preferences of con- 
sumers for the commodities are equal. 


A Test of the Marginal Preference Model 


The empirical solution to the problem of 
optimum levels of quality and price for com- 
modities or to the problem of optimum alloca- 
tion of costs in producing qualities requires 
the establishment of preference-cost functions. 
The preferences of consumers for different 
cost items are measured, and an appropriate 
function fitted defining preference in terms of 
cost variables. 

In order to make an empirical test of the 
marginal preference model, a simple study 
was performed by means of a food question-
naire.³ The meal which a consumer buys can
be regarded as a single article of which vari- 
ous dishes on the menu are qualities, or the 
meal can be considered to consist of an array 
of commodities bought by the consumer under 
the condition of having a certain amount of 
money to spend. 

A menu list was prepared in consultation 
with the manager of a restaurant serving 
home-style food. A list of 10 appetizers, a 
list of 10 entrees, and a list of 10 desserts 
were included in the menu with prices of the 
items. Students from a dining hall were 
asked to mark, as if free guests, the rank 
order of their preferences from one to 10, for 
appetizers, for entrees, and for desserts. 
They were then asked to select a meal costing 
from $1.80 to $2.00, including an entree and 
a dessert, and an appetizer if desired. 

The rank orders reported on the 263 com- 
pleted schedules are given in Table 1. Guil- 
ford’s composite standard procedure (4, pp. 
186-188), a modification of the method of 
paired comparisons, was used to measure the 
preference differences between appetizers, be- 
tween entrees, and between desserts, the scale 
values being given in Table 1. 

Each of the three categories of foods was 
treated as a qualitative component of the 
meal, and an empirical function established 
relating preference for the item to the cost of 
the item to the consumer within each of the 
three kinds of food. The assumption was 
made that the preference-cost functions of 
appetizers, of entrees, and of desserts are in- 
dependent of each other, so that the total 
preference for a meal is the sum of the pref- 
erences for the three parts of the meal. In 
this case the partial derivatives of the total 
preference-cost function are the same as the 
derivatives of the three separate functions. 

The preference-cost curves and the plots of 
data from which the curves were established 
are shown in Fig. 1. The formulas for the 
three curves, fitted by least squares, are, let-
ting $X_a$, $X_e$, and $X_d$ be prices of appetizers,
entrees, and desserts, and Y, the preference


3 This test and the function used for fitting pref- 
erence-cost curves were proposed by Ledyard Tucker, 
who has been conducting factor analysis of food 
preferences.


Table 1


Frequencies with which Appetizers, Entrees, and Desserts Were Ranked in 10 Positions by 263 Students,
and Scale Values Computed by the Composite Standard Method of Guilford


Menu Item with Price
Scale Value

Appetizers:
Fruit compote, 35¢
Shrimp cocktail, 40¢
Grapefruit juice with sherbet, 30¢
Canned fruit cocktail, 15¢
Tomato juice, 10¢
Vegetable soup, 20¢
Turkey soup, 25¢
Mushroom soup, 35¢
Pea soup, 15¢
No appetizer ordered

Entrees:
Roast beef, $1.75
Baked ham, $1.35
Veal cutlet, $1.40
Roast duck, $1.50
Chopped steak, $1.20
Chicken pie, $1.30
Broiled swordfish, $1.60
Turkey croquette, $1.10
Filet of flounder, $1.25
Baked sea bass, $1.15

Desserts:
Strawberry shortcake, 45¢
Pie, 30¢
Ice cream, 25¢
Sundae, 35¢
Baked apple, 30¢
Pineapple turnover, 40¢
Cake, 40¢
Brown Betty, 20¢
Cooked peaches, 20¢
Vanilla pudding, 15¢


6 1.66 
11 1.60 
10 1.34 
15 1.33 
27 
33 
51 
40 
48 
22 


3 
10 
14 11 
23 12 
20 16 
24 26 
44 54 
29 42 
51 39 
45 43 


2 3 
5 1 
10 6 
14 9 
21 9 
27 20 
21 26 
63 43 
54 65 
46 71 





for the meal, $= Y_a + Y_e + Y_d$,

$$Y_a = .053 - 1.802\,X_a + 3.196\sqrt{X_a}, \tag{9}$$

$$Y_e = 3.240 + 4.996\,X_e - 8.136\sqrt{X_e}, \tag{10}$$

$$Y_d = -5.024 - 13.363\,X_d + 18.099\sqrt{X_d}. \tag{11}$$


In fitting these curves the choice of origin is 
a matter of convenience. For each of the 
three curves fitted, the food item whose pref- 
erence was lowest on the list of 10 was used 
as the zero point in measuring preferences 
within its group. 

The preference-cost curves for appetizers
and desserts follow the expected principle of
diminishing returns, while the curve fitted for
entrees is approximately linear. The failure
of entrees to manifest diminishing returns
may be attributed to the possibly smaller de- 
gree to which this principle might apply to 
entrees, to the failure to include data from 
lower portions of the curve which might give 
it proper curvature, to sampling error in- 
volved in the original 10 entrees taken as 
representative, and to possibly unrealistic 
price estimates. 
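
Whether a fitted curve shows diminishing
returns can be read from the sign of its
square-root coefficient. The lines below work
this out for the functional form of equations
(9) to (11); the generic symbols $a$, $b$,
and $c$ are introduced here for illustration
and do not appear in the original.

```latex
\begin{aligned}
Y &= a + bX + c\sqrt{X},\\[4pt]
\frac{dY}{dX} &= b + \frac{c}{2\sqrt{X}},\\[4pt]
\frac{d^{2}Y}{dX^{2}} &= -\frac{c}{4\,X^{3/2}}
  \;<\; 0 \quad\text{if and only if}\quad c > 0 .
\end{aligned}
```

With $c = 3.196$ for appetizers and
$c = 18.099$ for desserts, the marginal
preference falls as price rises. With
$c = -8.136$ for entrees the fitted marginal
preference rises slightly over the observed
price range, which is the departure from
diminishing returns noted in the text.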

After determining the preference-cost func-
tions, their derivatives were taken, costs for
the three classes of food were expressed as
functions of marginal preference, and substi-
tutions made in the constant cost equation.
The average expenditure for the meal made
by the respondents was $1.93, which was
taken as the constant cost within which se-
lections were made from the menu.


Fig. 1. Preference-cost curves for appetizers, entrees, and desserts. (Panels: Appetizers, Entrees, Desserts; vertical axis: scale value of preference for menu item; horizontal axis: price of menu item.)


The price points at which it was predicted
that subjects would order their appetizers,
entrees, and desserts are $.24, $1.32 and $.37,
respectively. The empirically observed av-
erage prices of items ordered in these three
categories are $.22, $1.41, and $.30. These
findings are an approximate fit in an ex-
ploratory test.

In inquiring why the fit is not closer, sev-
eral other circumstances may be noted. (a)
Appetizers, entrees, and desserts are not uni-
dimensional qualities, and consumer prefer-
ences along these dimensions are only ap-
proximately scalable. (b) The subjects are
probably not homogeneous as to their pref-
erence-cost functions for foods, although the
analysis treats them as if they were. The
establishment of preference-cost functions in
more accurate research is facilitated by segre-
gation of subjects into homogeneous sub-
groups before carrying through the analysis
of their marginal preferences.

(c) When entrees are ordered, larger quan- 
tities of food are involved, which may not be 
taken into account sufficiently by taste pref- 
erences. If attention is restricted to appe- 
tizers and desserts alone, these involving more 
nearly comparable quantities of food, the 
prices predicted under the constant cost re- 
striction of $.52 for the two items are $.18 
and $.34. 
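
The predicted price points can be retraced
numerically from the fitted curves. The sketch
below takes the coefficients of equations (9)
to (11), inverts each derivative to express
cost as a function of the common marginal
preference M, and finds M by bisection so that
the costs add up to the fixed outlay. The
function names, the bisection method, and the
search brackets are assumptions made for
illustration; the paper does not say how the
equations were solved.

```python
import math

# Coefficients (a, b, c) of Y = a + b*X + c*sqrt(X), equations (9)-(11).
CURVES = {
    "appetizer": (0.053, -1.802, 3.196),
    "entree": (3.240, 4.996, -8.136),
    "dessert": (-5.024, -13.363, 18.099),
}


def cost_at_marginal(curve, m):
    """Invert dY/dX = b + c / (2*sqrt(X)) = m to get the cost X."""
    _, b, c = curve
    return (c / (2.0 * (m - b))) ** 2


def solve_allocation(items, budget, lo, hi, tol=1e-10):
    """Bisect on M in [lo, hi] until the item costs sum to the budget."""
    def excess(m):
        return sum(cost_at_marginal(CURVES[k], m) for k in items) - budget

    f_lo = excess(lo)
    if f_lo * excess(hi) > 0:
        raise ValueError("bracket does not contain a root")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = excess(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    m = 0.5 * (lo + hi)
    return m, {k: cost_at_marginal(CURVES[k], m) for k in items}


# Brackets are assumed; each isolates one root in the observed price range.
m_meal, meal = solve_allocation(["appetizer", "entree", "dessert"], 1.93, 1.0, 2.0)
m_pair, pair = solve_allocation(["appetizer", "dessert"], 0.52, 1.0, 3.0)
print({k: round(v, 2) for k, v in meal.items()})
print({k: round(v, 2) for k, v in pair.items()})
```

Run this way, both cases come out within about
a cent of the prices reported in the text,
which supports the reading of the fitted
coefficients given in equations (9) to (11).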

In the empirical test of food preferences 
which has been described, there already ex- 
isted a variety of food items in use at various 
quality and price levels. Buying behavior 
with respect to these was used as a test of
the validity of the prediction. In the context
of marketing problems, there may not yet 
exist commodities with new qualities, and it 
is then desired to specify what the charac- 
teristics of these commodities should be for 
maximum consumer preference. The investi- 
gation of preferences for prevailing or ex- 
perimental products permits preference-cost 
curves to be defined. The model can be used 
to predict what the point of maximum pref- 
erence would be, and the extent to which such 
predictions can later be verified by gaining 
consumers in competition provides the ulti- 
mate test of the model. If the model is not 
fully applicable, it articulates working hy- 
potheses in terms of which further qualifica- 
tions can be formulated in order to apply the 
model to the prediction of consumer behavior 
toward new products. 

Among additional qualifications, there may 
be noted the role of price as a psychological 
attribute of an article. The marginal prefer- 
ence for price is determined by observing how 
much the preference for a commodity changes 
when the price tag is altered, while its other 
qualities remain unchanged. 

Also of consequence: it may be uneconomic 
to produce a separate commodity suited to 
each small homogeneous group of consumers. 
Which forms of a commodity it is profitable 
or socially useful to produce is a question re- 
quiring study. An approach to problems of 
qualitative economics has recently been pub- 
lished by Abbott (1). 


Other Answers Indicated by the Marginal 
Preference Model 


1. The problem of the amounts of money 
to be spent upon packaging, advertising, and 
selling may be treated as if each of these fac- 
tors were a qualitative characteristic of the 
commodity. The amount spent per produc- 
tion unit for each of these factors, if optimum 
preference for the product is achieved, is 
given by the relationship of equal marginal 
preferences under the condition of fixed total 
cost of production and distribution. 

It is not empirically feasible for the consumer
to express preferences between a commodity
before a prescribed amount of advertising
has been used and the same commodity
afterwards. Instead, the method of successive
intervals may be used, or the commodity
at two different times may be compared with 
one or more other commodities employed as 
standards of comparison. 

The resulting preference-cost curves are the 
same in nature as those already described. 
Money is profitably expended upon advertis- 
ing of various types up to the point where the 
marginal preferences produced by these types 
are equal to each other and to the marginal 
preferences for manufactured qualities. 

Preference curves for selling outlay can be 
established by studying the preference of buy- 
ers before and after selling activities have 
been carried on. The outlay for selling ac- 
tivities is determined at the point where the 
marginal preference, with respect to selling 
cost, is equal to other marginal preferences. 

2. Where qualitative changes can be made 
in a commodity at no increase in manufactur- 
ing cost, as in the choice between vanilla and 
mint flavors or between various combinations 
of these in a dentifrice, we are not concerned 
with marginal preference with respect to cost 
under cost restrictions. The problem is a 
simpler one of determining the point of 
qualitative mixture of flavors at which pref- 
erence for the dentifrice is maximized for the 
homogeneous subgroup of consumers studied. 
The partial derivatives of preference with re- 
spect to qualities defined by laboratory rather 
than cost measurements are set equal to zero, 
and the required fractions of flavoring thereby 
specified. 
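The equal-marginal-preference condition of point 1 can be illustrated with a small numerical sketch. The square-root preference-cost curves, the weights, and the total cost below are hypothetical values chosen only for illustration; none of them comes from the paper's data, and the function names are likewise assumptions.

```python
import math

def allocate(weights, total_cost):
    """Split a fixed total cost across factors whose (assumed)
    preference-cost curves are P_i(c) = w_i * sqrt(c)."""
    # Equal marginal preferences means w_i / (2 * sqrt(c_i)) is the same
    # for every factor, which makes c_i proportional to w_i squared.
    scale = total_cost / sum(w * w for w in weights.values())
    return {name: scale * w * w for name, w in weights.items()}

def marginal_preference(w, c):
    """Derivative of w * sqrt(c) with respect to cost c."""
    return w / (2.0 * math.sqrt(c))

# Hypothetical weights for three factors and a hypothetical fixed cost.
weights = {"packaging": 1.0, "advertising": 2.0, "selling": 3.0}
spend = allocate(weights, total_cost=14.0)
margins = {k: marginal_preference(weights[k], spend[k]) for k in weights}

for name in weights:
    print(name, spend[name], margins[name])
```

Under these assumed curves the outlay on each factor comes out proportional to the square of its weight, and the three marginal preferences coincide, which is the condition the text states for optimum preference at fixed total cost.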


Summary 


A research model for the analysis and pre- 
diction of consumer behavior is proposed. 
This model is an extension of marginal utility 
principles in that preference, a measurable 
variable, is employed in place of utility, and 
in that the principle of maximization of pref- 
erence is extended to qualitative degrees as 
well as quantities of commodities. 

In an exploratory test using food prefer- 
ence data, the average prices of appetizers, of 
entrees, and of desserts ordered by 263 per- 
sons were predicted with a mean error of six 
cents. 

This paper has sought to call attention to
economic implications of preference measurements
of data of a form commonly collected
in consumer studies, implications which apparently
merit further empirical and theoretical
examination.


Test Validity Over a Seventeen-Year Period


E. B. Knauft 
Aetna Life Affiliated Companies, Hartford, Connecticut


A 15-minute general mental ability test, 
known as LOMA-1, has been administered to 
applicants in the home office of the Aetna Life
Affiliated Companies since 1937. The most 
recent validity study of this test, although 
generally confirming previous studies, was felt 
to be of interest because it covers a period of 
17 years. 

The principal finding concerns the relation- 
ship between test score and job level the em- 
ployee has attained over a period of years. 
The analysis was based on 692 persons hired 
between 1937 and 1949 and still employed by 
the company on March 1, 1954. These persons
were tested at the time they were employed.
A product-moment correlation of
+.60 was obtained between LOMA test score
and present job classification of the 692 employees.
The classification system for these
jobs includes seven grades or classes ranging
from simple clerical jobs such as file clerk to 
complicated decision-making jobs such as 
senior underwriter and senior claim examiner. 
The great majority of these employees started 
in one of the bottom three classes. The com- 
pany adheres to a policy of promotion from 
within, and it is very unusual for anyone to 
be initially employed on a job above the third 
class from the bottom. The correlation of 
+.60 thus indicates that the LOMA test
score is a fairly good predictor of the extent 
to which an employee will be promoted over 
a period of years. 

Table 1 summarizes the relationship be- 
tween job class attained and LOMA score. 
The number and percentage of persons falling 
in each job class are given for each of three 
score categories. 


Table 1
Relationship Between Job Class and LOMA Score

LOMA score categories legible in the heading: 100-119; 120 and Over
[Only three data columns are legible; their score categories cannot be determined.]

Job Class                    N      %      [N or %]
Decision-making 3            1      1       3
Decision-making 2            8      5      25
Decision-making 1           23     13      26
Complicated clerical 2      59     34      24
Complicated clerical 1      56     32      16
Simple clerical 2           23     13       5
Simple clerical 1            3      2       1

Total                      173    100     100





A second criterion for evaluating the effec- 
tiveness of the LOMA test is the individual’s 
present job performance. In several depart- 
ments production records were available for 
simple and complicated clerical jobs which
are on a wage incentive plan. The criterion
used was the mean of 14-weeks bonus efficiencies
of employees who had been on the
job long enough to be producing at a relatively
constant rate. The bonus efficiency
represents net production with allowances for
time on “non-bonus” activities. Results obtained
from four departments used in this
study are summarized in Table 2. Criterion
reliabilities are based on a correlation between
odd and even weeks for 16 weeks. Validity
coefficients are product-moment correlations
between the production criterion and LOMA
score.


Table 2
Relationship Between LOMA Score and Production

Department       No. Cases    Criterion Reliability    Validity Coefficient
[not legible]       36               .92                     .40*
[not legible]       23               .91                     .48*
[not legible]       14               .94                     .29
[not legible]       19               .87                     .32

* Significant at 5% level of confidence.
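As a rough check on which validity coefficients in Table 2 reach the 5% level, the usual t test for a product-moment correlation can be sketched as follows. Several points are assumptions rather than statements from the note: that the test is two-tailed, that the garbled second coefficient reads .48, and that the critical values (copied here from a standard t table) are the ones the author would have used.

```python
import math

# (number of cases, validity coefficient) as read from Table 2.
# Reading the second coefficient as .48 is an assumption.
departments = [(36, 0.40), (23, 0.48), (14, 0.29), (19, 0.32)]

# Two-tailed 5% critical values of Student's t from a standard table,
# keyed by degrees of freedom (n - 2).
T_CRITICAL_05 = {34: 2.032, 21: 2.080, 12: 2.179, 17: 2.110}

def t_for_r(r, n):
    """t statistic for testing that a product-moment correlation is zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

results = []
for n, r in departments:
    t = t_for_r(r, n)
    # Each entry: cases, coefficient, t statistic, exceeds critical value?
    results.append((n, r, t, t > T_CRITICAL_05[n - 2]))

for row in results:
    print(row)
```

Under these assumptions the first two departments exceed the critical value and the last two do not, which agrees with the placement of the asterisks in the table.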

Although the LOMA test appears to be 
most effective as an aid to prediction of job 
class eventually attained, indications from 
four small samples suggest that at least in 
some instances it is also effective as a partial 
predictor of performance on various kinds of 
clerical jobs. 


Received January 3, 1955.


Lindzey, G. (Ed.). Handbook of social psychology.
Cambridge, Mass.: Addison-Wesley
Publishing Co., 1954. Vol. I: Theory
and method. Pp. 1-588. Vol. II: Special 
fields and applications. Pp. 601-1226. 
$15.00 ($8.50 for one volume). 

This book is a major attempt to present, 
summarized in handbook fashion, what is 
known theoretically, methodologically, and 
substantively in the area of social psychology. 
The various chapters include contributions 
by psychologists, sociologists, anthropologists, 
and statisticians. Most of the chapters are 
written carefully and thoughtfully. It is a 
good and worth-while book to have in print. 
Many students and research workers will 
have occasion to refer to it. 

Since a brief review cannot hope to deal 
with the details of what is in such a two-volume
work, I will confine myself entirely to
commenting on the relation between this 
handbook and the field of social psychology 
as this reviewer sees it. When I first saw the 
two volumes, I was astounded that it took
1,226 large double-column pages to summarize
what is known on the subject. After having
worked in this field for about ten years, I 
have had the somewhat disillusioned impres- 
sion that what is really known could be sum- 
marized in about 100 pages. Quadrupling 
this figure, to take account of my own biased 
and skeptical nature, still leaves quite a dis- 
crepancy in length. I was consequently 
highly motivated to see how so many pages 
were used. 

The section of the book on “Contemporary 
Systematic Positions” contains five chapters. 
Three of these, dealing with reinforcement 
theory, cognitive theory, and psychoanalytic 
theory, really have little relevance to social
psychology. They summarize theories about 
other problem areas in psychology, with oc- 
casional strained reference to the social field. 
These chapters, with relatively few changes, 
would be equally relevant (or equally irrele- 
vant) to a handbook of child psychology or 
of industrial psychology. This is equally 
true throughout more than half of the fourth 
chapter that deals with field theory. The
problem is, of course, that there is very little
theory directly pertinent to social psychology. 
If we want to look at it optimistically, we 
might say that the editor, by including this 
section, expresses his hope that in the future 
these theories will be relevant. 

The first three chapters of the next section 
on “Research Methods” are similarly general. 
One on design of experiments and another on 
statistical methods, although well written, are 
not particularly oriented toward the problems 
of research in the social field. The third 
chapter entitled “Attitude Measurement” 
sounds relevant but it might more properly 
have been entitled “The Logic of Scale Con- 
struction.” 

But this rather tangential relevance to so- 
cial psychology holds only in portions of the 
first volume. The second volume deals di- 
rectly with the area. It is clear, however, 
that social psychology is defined very broadly 
by the editor, much more broadly than many 
persons would define it. Social psychology, 
apparently, runs the gamut from chapters on 
psycholinguistics and on humor to chapters 
on national character and on culture. With 
this broad definition of the area it is indeed 
difficult to keep anything out. For example, 
the first chapter in this second volume con- 
tains 15 references to studies in experimental 
and physiological psychology. 

It seems to me that the editor of this hand- 
book has expressed a certain philosophy about 
social psychology with which I heartily agree. 
He seems to feel that its province cannot 
really be separated from either the province 
of psychology or of sociology and anthro- 
pology. He seems to feel that theories of 
psychology or of sociology ought to be, al- 
though they are not yet, equally theories of 
social psychology. The behavior and func- 
tioning of the human doesn’t follow different 
laws depending upon whether or not he is in 
the company of, or is oriented toward, other 
humans. 

But with such an orientation, no matter 
how wholesome it may be, the job of select- 
ing material for inclusion in a handbook of 
social psychology is difficult. It also may be
that the selection of material and the organization
of the volumes was heavily influenced
by the particular interests at Harvard Uni- 
versity. It is true that out of 30 chapters, 
almost half are written by persons who were 
at Harvard when the volumes were planned. 


Leon Festinger 
Stanford University 


Thrall, R. M., Coombs, C. H., and Davis, 
R. L. (Eds.). Decision processes. New 
York: John Wiley & Sons, 1954. Pp. viii 
+ 332. $5.00. 

This volume is a collection of papers from 
the University of Michigan summer seminar 
at Santa Monica, California, in 1952. This 
seminar, devoted to decision processes, had 
as participants mathematicians, statisticians, 
psychologists, economists, and philosophers. 
The papers show this variety of background 
and interest. 

The center of all decision-making is a pay- 
off table, such as the following. 


                        Column No.
Possible decisions     1       2       3

        A            U_A1    U_A2    U_A3
        B            U_B1    U_B2    U_B3
        C            U_C1    U_C2    U_C3


The columns may have different meanings. 
In games against nature, each column is a 
possible state of nature and each U represents
the outcome of a certain decision if a
certain state of the world exists. In social
choice, the columns represent individuals and 
each U is the desirability to a particular indi- 
vidual of a certain social decision. In some 
choice situations, the columns might be par- 
ticular qualities of each action, and each U 
a measurement. (A, B, and C might be cars 
one is considering buying and the columns 
represent horsepower, size, and cost.) Some 
method is required for choosing between ac- 
tions if no decision is best in each column. 
For social choice, two principal methods 
exist: majority rule, or some variation of it, 
or dictatorship. In the former, each indi- 
vidual chooses the best decision according to 
his column and the society chooses the row 
favored by the most individuals. In a dictatorship,
decisions are made in the first
column and the others ignored.
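The two social-choice rules just described can be sketched on a small pay-off table. The table values and the function names below are hypothetical, chosen only to show that the two rules need not agree; they do not come from the volume under review.

```python
from collections import Counter

# Hypothetical pay-off table: rows are possible decisions, columns are
# individuals, entries are the desirability of the decision to each one.
payoff = {
    "A": [3, 1, 2],
    "B": [1, 3, 3],
    "C": [2, 2, 1],
}

def majority_rule(table):
    """Each column votes for its best row; the row with most votes wins."""
    n_cols = len(next(iter(table.values())))
    votes = Counter()
    for j in range(n_cols):
        best_row = max(table, key=lambda row: table[row][j])
        votes[best_row] += 1
    # Ties are broken by whichever row was counted first.
    return votes.most_common(1)[0][0]

def dictatorship(table, dictator=0):
    """Only the dictator's column is consulted; the others are ignored."""
    return max(table, key=lambda row: table[row][dictator])

print("majority rule chooses", majority_rule(payoff))
print("dictatorship chooses", dictatorship(payoff))
```

In this made-up table the first individual prefers A while the other two prefer B, so the two rules select different rows.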

For the game against nature, the best 
method is the Bayes solution. In this method, 
one multiplies each column by the prob- 
ability that it will occur, adds the rows, and 
chooses the best. This method cannot be 
used, however, if these probabilities cannot 
be determined or if the values cannot be 
stated in a metric which permits such multi- 
plication and addition. To cover these cases, 
various other decision rules have been pro- 
posed. 
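The Bayes solution as described (weight each column by its probability, sum across each row, choose the best row) can be sketched in a few lines. The pay-off values and probabilities are hypothetical, and the sketch assumes exactly what the reviewer says the method requires: known probabilities and values on a scale that permits multiplication and addition.

```python
# Hypothetical pay-off table: rows are decisions, columns are states of nature.
payoff = {
    "A": [10, 0, 4],
    "B": [6, 6, 2],
    "C": [3, 5, 8],
}

# Hypothetical probabilities of the three states of nature (they sum to 1).
probabilities = [0.5, 0.3, 0.2]

def bayes_choice(table, probs):
    """Return the row with the highest probability-weighted sum,
    together with the weighted sum for every row."""
    expected = {
        row: sum(p * u for p, u in zip(probs, values))
        for row, values in table.items()
    }
    best = max(expected, key=expected.get)
    return best, expected

best, expected = bayes_choice(payoff, probabilities)
print("weighted sums:", expected)
print("Bayes choice:", best)
```

If the probabilities are changed, the chosen row can change as well, which is why the method gives no answer when the probabilities cannot be determined.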

Most of Part I of this volume is devoted to 
a discussion of these alternative criteria. The 
importance of having such criteria is inad- 
vertently demonstrated in the discussion by 
Roy Radner and Jacob Marschak. To dem- 
onstrate the undesirable features of two pro- 
posed criteria, they construct a game and in- 
dicate that the results which these rules give 
do not conform to the common-sense solution. 
Their “common-sense solution” is not met by 
any criteria proposed in the book and seems 
to be reasonable only if one applies certain 
probabilities to the Bayes solution, but if 
these probabilities are known, no other cri- 
teria are necessary. John Milnor also dis- 
cusses various criteria and comes to the 
conclusion that none of them have all the 
desired characteristics. He then proves that 
a suitable criterion exists, and suggests that it 
is too complicated to be workable. Whether 
this result is encouraging or discouraging is 
not clear. 

Part II of this volume is concerned with 
learning theory. Robert R. Bush, Frederick 
Mosteller, and Gerald L. Thompson give a 
generalization of the Bush-Mosteller learning- 
theory models and Merrill M. Flood gives 
certain results of Stat-rat (or Monte Carlo) 
learning problems for certain games. I am 
not sufficiently familiar with learning experi- 
ments to comment on this model. It is inter- 
esting to note that, as a decision process, the 
criterion is essentially Bayes, and the prob- 
lem becomes one of estimating the appro- 
priate probabilities. 

Part III is concerned with the utility func- 
tion. These studies are concerned with the 
logical structure of a utility function, whether
the ordering is complete or partial, and how
detailed the metric can be which is erected
upon this base. This entire section suffers
from the nature of the discussion, which is 
mainly axiomatic. It is clear that a set of 
axioms can be drawn that will give the utility 
function any desired set of properties. The 
interesting questions do not fit into this 
framework at all, but hinge around the de- 
sign of experiments to test whether a given 
axiomatic system actually fits individuals’ 
preference patterns. These articles present 
only the results of introspection, although the 
article by C. H. Coombs and David Beards- 
lee in Part IV uses the novel approach of re- 
porting the introspection of a secretary rather 
than the authors. It is true that it is neces- 
sary to develop an axiomatic system before 
testing, but the Mosteller-Nogee experiments 
seem rather lonely amid all the theoretical 
discussions of utility. 

The article by Jacob Marschak contains, 
in addition to a highly controversial theory 
of utility, a very interesting analysis of cer- 
tain problems of information and organiza- 
tion. To an economist, this is perhaps the 
most interesting section of the book, but it 
has no place here. It applies, at least in its 
present form, to profit maximization with 
known probability distributions, which is es- 
sentially a trivial decision problem. 

Part IV contains a discussion of certain ex- 
perimental studies of certain game situations. 
Paul Hoffman, Leon Festinger, and Douglas 
Lawrence discuss the effect of presumed in- 
tellectual status and the importance of the 
game upon the formation of coalitions and 
division of the spoils. Their article is essen- 
tially a study in the sociology of decision- 
making. G. Kalisch, J. W. Milnor, J. Nash, 
and E. D. Nering discuss the formation of 
coalitions in general. Theirs is much more 
a contribution to game theory and, implicitly, 
to psychology. 

There is an interesting pair of articles by 
William K. Estes in Part II and by Merrill 
M. Flood in Part IV. Estes observed that 
subjects in experiments usually bet on the 
more probable outcome and sometimes on the 
less probable, “usually” and “sometimes” 
having about the same frequency as “more
probable” and “less probable,” respectively.
Game theory implies that one should always 
bet on the most probable. Flood attempts 
to explain this discrepancy in terms of pos- 
sible changes in the probabilities or differ- 
ence in motivation. His explanation is plau- 
sible, but his data are, as he admits, incon- 
clusive. 
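The discrepancy between the observed betting pattern and the rule of always betting on the more probable outcome can be made concrete with a short calculation. The sketch assumes independent trials with a fixed probability p for the more probable outcome, and a subject who bets on it with that same frequency p; the value of p is hypothetical and the function names are illustrative.

```python
def maximizing_hit_rate(p):
    """Expected proportion of correct bets when always betting on the
    more probable of two outcomes."""
    return max(p, 1.0 - p)

def matching_hit_rate(p):
    """Expected proportion of correct bets when betting on each outcome
    as often as it occurs (frequency p and 1 - p), independently of
    which outcome actually occurs on that trial."""
    return p * p + (1.0 - p) * (1.0 - p)

p = 0.75  # hypothetical probability of the more probable outcome
print("always bet on the more probable:", maximizing_hit_rate(p))
print("bet in proportion to frequency:", matching_hit_rate(p))
```

For any p other than one half or one, the second strategy yields fewer correct bets than the first, which is the sense in which the observed behavior departs from what the theory implies.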

On the whole, this is a very important vol- 
ume for decision theorists, but most of them 
wrote it, or were at least present at the 
seminar. Its value to working psychologists 
and economists is less clear. Most of these 
are working papers, theories about how one 
might build a theory. On occasion there is a 
preoccupation with rigor which leads to un- 
due concern with rare cases. This is espe- 
cially true in the utility analysis. (Perhaps 
some of this concern is the result of the in- 
fluence of RAND Corporation personnel, to 
whom the equivalent of being beheaded at 
sunrise is not so remote from decision proc- 
esses as it is to an economist studying con- 
sumer demand.) Many social scientists will 
be concerned with the material of this vol- 
ume, for the problems it raises are immedi- 
ate, but one might wish for a somewhat more 
heuristic approach. 

Joseph P. McKenna 


Department of Economics 
St. Louis University 


Berdie, Ralph F. (with chapters by Wilbur
L. Layton and Ben Willerman). After
high school—what? Minneapolis: Univer. 
of Minnesota Press, 1954. Pp. xii + 240. 
$4.25. 

This volume reports the methods and re- 
sults of a major investigation of the post- 
high school plans of high school seniors and 
of a number of factors believed to be related 
to those plans. More specifically, the pur- 
pose of the study was “to investigate the fac- 
tors determining college attendance with par- 
ticular attention to a comparison of deter- 
minants related to economic status and those 
related to cultural or educational status.” 

The major technique employed was the 
ubiquitous questionnaire. Completed ques- 
tionnaires were obtained from approximately 
93 per cent of all Minnesota high school
seniors. ACE scores and rank in graduating
class also were available. A statement of 
plans for the future was a major item of in- 
formation. ACE score, high school rank, 
sex, place of residence, age, marital status, 
home status (broken, etc.), parental occupa- 
tion, parental and sibling education, language 
background, family financial status, prospec- 
tive financial aid, family attitudes toward 
plans, student’s reasons for plans, high school 
curriculum, occupational goals, material pos- 
sessions, books and magazines in the home, 
and membership of parents in organizations 
were the “determinants” studied. 

A subsequent questionnaire follow-up of a 
10 per cent sample determined the extent to 
which plans had been carried out in order to 
answer the question: How did those who fol- 
lowed their plans differ from those who did 
not? 

In a subsidiary study, half-hour open-ended 
interviews were held with the parents of 90 
students in one community in order to con- 
firm some of the information collected in the 
questionnaire and to gather types of data, 
particularly those relating to parental atti- 
tude, that could not be efficiently collected 
through the questionnaires.

The findings are too numerous and the relationships
too complex to enumerate in detail.
However, the major findings can be
stated briefly. There is a waste of talent; 
only 35 per cent plan on college (this pro- 
portion actually gets there). In addition, 32 
per cent of the high aptitude students do not 
plan to attend college. The waste is higher 
among girls than among boys. Geographic 
factors are important; students from farm 
areas were least likely to plan for college. 
Economic status and cultural status of the 
home are significantly related to college plans. 
Further, a “close relationship exists between 
the plans made by students during their 
senior year in high school and their subse- 
quent behavior during the following year.” 

Dr. Berdie and his associates stress the 
role of the home and family in college planning
although they give due recognition to
economic and ability factors. Thus the broad 
generalization, “To maximize the probability 
that an able student will go to college, have
him born into the right home.” The author
outlines a program which includes activities 
directed toward the parents of high school 
students along with more conventional recom- 
mendations designed to improve the utiliza- 
tion of our intellectual resources. Counseling 
gets a big plug. 

The study reported here gains in signifi- 
cance because of its comprehensive coverage 
both with respect to the population studied 
and to the types of data included. If the 
generalizations sometimes seem to outrun the 
data (despite disclaimers), this is a state of 
affairs not completely unknown in survey re- 
search. A brief attempt was made to provide 
a theoretical orientation (Lewinian in na- 
ture) to the phenomenon of college attend- 
ance. An expansion of this conceptual frame- 
work would be of great interest. 

This is a solid contribution to current work 
on manpower problems. 

Arthur H. Brayfield 

Kansas State College 


Roethlisberger, F. J., and others. Training 
for human relations: an interim report. 
Boston: Harvard University Graduate 
School of Business Administration, 1954. 
Pp. xvii + 198. $2.00. 

The Harvard Business School, with funds 
made available by the Ford Foundation, be- 
gan a ten-year program of research and train- 
ing in human relations in 1951. This little 
book is a report of the first three years’ ex- 
perience. 

It describes the goals of the program, the 
clientele, the working philosophy of the staff, 
and the activities of the trainers and trainees. 
It gives a blow-by-blow account of the early 
struggles that any such program must en- 
dure. It presents a candid picture, in a most 
disarming manner, of the successive insights 
achieved by the participants and of the suc- 
cessive modifications of the program which 
resulted. It lists the case studies (29 of 
them) which have been produced thus far. 

The basic issue in human relations training 
is presented as a choice between two concep- 
tions. “Again and again we were forced to 
the conclusion that only by ‘sweating out’ 
the uncertainties, contradictions, and imperfections
of behavior in himself and others
does the practitioner become aware of the 
complexity of relationships in concrete phe- 
nomena and slowly gain confidence and com- 
petence in dealing with this complexity... . 
Yet in looking about us again we found some 
schools of thought in the behavioral sciences 
moving in another direction. They were try- 
ing to provide the practitioner with tools, 
techniques, ideals, skills, and attitudes which 
would lead to more desirable effects in his 
dealings with people and groups. Under this 
conception ‘skill’ becomes something to be 
learned, a technique, rather than a way of 
learning” (pp. 176-7). The authors con- 
clude: “It is our belief that not only those 
‘who stand and wait’ but those ‘who stand 
and sweat’ can also serve” (p. 180). 

Great care was taken in designing this program
to avoid becoming addicted to the cultist
wing of human relations training. This
danger can be escaped, it is maintained, by
recognizing that the world in which human
relations take place is multidimensional. A
trainer or practitioner who concentrates upon
a single dimension is bound to be ineffective
and frustrated because he is dealing with only
part of this complex world. As a consequence
of such frustration he is likely to become a 
rigid and obsessive proponent of a cultist 
movement in order to defend himself against 
his own doubts and insecurities. Despite the 
defensiveness of cults, many have contributed 
important, though limited, insights to the 
total problem of improving human relations. 
In developing this training program every 
effort was made to bring these insights into 
a single conception of human relations which 
will do justice to the complexity of the real 
world. 

For whom is this report written? This 
question is asked in the Preface, but the au- 
thors do not really answer it there. After 
reading the full report, this reviewer still can- 
not answer it. The book is essentially an 
autobiography of the project, written at the 
unusual age of three. In a sense it was writ- 
ten for the authors themselves and for those 
who participated in the program. It will be 
of considerable interest to those who may 
have a part in the project’s future. Finally,
it will do much to satisfy the curiosity of
those who wonder about the private life of a 
training program for the trainers of prac- 
titioners of human relations. 
Dorwin Cartwright 
University of Michigan 


Hyman, Herbert H. (with Cobb, William J., 
Feldman, Jacob J., Hart, Clyde W., and 
Stember, Charles Herbert). Interviewing 
in social research. Chicago: Univer. of 
Chicago Press, 1954. Pp. 407. $8.00.

Beginning with Stuart Rice’s classic analysis
of “Contagious Bias in the Interview” in
1929 many reports of studies on interviewer 
bias have appeared in the social science lit- 
erature. Most of these studies demonstrated 
that the interview is an unreliable method of 
collecting data. Now a major book appears 
which gives structure and integration to ex- 
isting studies and reports new research based 
on a sound conceptual basis. 

The book reports a series of studies by the 
National Opinion Research Center. Hart, in 
the foreword, describes the purpose of the 
studies as being: “(1) To determine and 
evaluate empirically the factors that may op- 
erate within the interview to produce error in 
the data derived from it, and (2) to test the 
amenability of these factors to methods of 
control designed to minimize their effects” 
(p. vii). The book is not, as the name im- 
plies, a treatise on interviewing methods, but 
is a series of investigations into the nature 
and sources of bias in the interview. Many 
persons and organizations contributed data 
and completed analyses to the report. Hy- 
man, the senior author, had the major re- 
sponsibility for guiding the research and pre- 
paring the book. 

The focus of the research is the interview 
in survey research, although it is obvious that 
many of the findings can be generalized to 
other types of interviews, not only in social 
science research but wherever the interview 
is used as a means of collecting information. 
Chapter 2 sets the problem and the con- 
ceptual framework which serves as a basis 
for the analyses which follow. The authors 
point out that historically research on the 
interview has been concerned largely with the
interviewer’s own attitudes and the effect of
those attitudes on the interview results. They 
stress that this influence may be less impor- 
tant than other factors such as interviewers’ 
beliefs about the population, or how they ex- 
pect respondents to think and react to ques- 
tions. They point to the importance of other 
cognitive processes both in the interviewer 
and in the respondent, as well as to the inter- 
action forces such as rapport variations and 
mutual perception, as potential sources of 
bias. 

The book contains studies on problems of 
bias in the interviewer, the respondent, and 
interaction between the two. Much attention 
is paid to the part that the interviewer’s ex- 
pectations and his perception of the respond- 
ent play in maximizing potential for bias in 
probing, recording, and coding of the re- 
sponses. 

Other analyses are made of the biases involved
in the way in which the respondent perceived
the interviewer. Some of these perceptions
involve the interviewer himself, in terms
of his perceived socioeconomic condition, race, 
group membership, etc. Other so-called
“situational determinants” of interviewer effect
are also investigated, such as types of
questions and the interviewer’s perception of
how respondents will react to them. These
are only a few of the areas which are ex- 
plored in this volume but they will serve to 
indicate the advance which the book repre- 
sents over the usual studies and concepts of 
interviewer bias. It is by far the most com- 
prehensive and best documented book in the 
field. In addition to the original research 
contributions, the authors have done an ex- 
haustive job in reviewing previous research 
in the area. 

This reviewer considers the book a major 
contribution to social science measurement. 
It is indispensable to social scientists who are 
using interviews for data collection. Fur- 
ther, the book marks the end of an era where 
fragmentary data of interviewer bias was 
worth an article. It opens new vistas for the 
methodologist. 

It is, however, primarily a book for the 
methodologist, not for the practitioner in interviewing
methods. The book focuses much
more attention on documenting sources of
error than it does on practical methods for 
overcoming error. The section on overcom- 
ing bias is sparse in practical solutions to 
field problems since the research is not easily 
translated into practical field suggestions. 
Rather it suggests areas for research which 
may eventually lead to practical guides for 
improving interviewing techniques. 
Charles F. Cannell 
Survey Research Center 
University of Michigan 


Woodson, Wesley E. Human engineering
guide for equipment designers. Berkeley: 
Univer. of California Press, 1954. Pp. 262. 
$3.50. 

Prior to publication, a manuscript of this 
book was submitted to many experts in hu- 
man engineering from military, industrial, and 
university fields. Their comments and criti- 
cisms were invited. A comparison of the pre- 
liminary and final copies indicates that this 
preliminary review resulted in some signifi- 
cant improvements. The final copy includes 
some supplementary material and appears to 
be better organized than the original. 

The purpose of the book, as expressed in 
the Introduction, is “to aid the designer in
making optimum decisions whenever human
factors are involved in man-operated equipment. . . .”
The author states that the book
will provide a central source for information 
about the human operator, will point up the 
relative importance of variables which make 
a difference, and will indicate solutions for 
typical design problems. The book repre- 
sents a commendable effort to fulfill this 
purpose. 

An outstanding feature of this guide is the 
application of basic psychological and physio- 
logical data to specific problem areas. For 
example, the section on hearing, in addition 
to the conventional information on audition, 
suggests a way to improve speech intelligi- 
bility by wearing specially designed ear plugs. 
There are many illustrations to help the 
reader to interpret the data, which adds to 
the book’s value as a reference source. A 
short section in the introduction should be 
very helpful, because it enables the user to 
establish system requirements. The designer 
must know what the system, composed of 
both men and machines, is to accomplish. 
Then he can emphasize those aspects of hu- 
man activity which make the greatest con- 
tribution to the over-all performance of the 
system. 

The first chapter of the book is devoted to 
the “Design of Equipment and Workspace” 
and closely approximates the standard format 
of an engineering handbook. This chapter 
makes up roughly 50 per cent of the book 
and is the one which will probably prove 
most understandable to design engineers. 
Other chapters incorporate what currently 
represents the remainder of general human 
engineering data. Information on vision and 
audition is presented in Chapters 2 and 3. 
While these chapters are fairly complete, the 
information is not presented in a form most 
appropriate for use by the intended reader. 
Considerable effort would be required by de- 
sign engineers to uncover those specific de- 
tails which would have most meaning for 
them. The same general criticism applies to 
Chapter 4 on “Body Measurements.” The 
anthropometric data, covering dimensions of 
various body members, limits of reach, and 
other functional information are difficult to 
interpret and to relate to specific problems. 

The major limitation of this Human Engi- 
neering Guide (since it is to be used by de- 
sign engineers) is that the organization and 
presentation of material is along psychologi- 
cal lines. The average human engineering 
psychologist would have relatively little diffi- 
culty in using the data. In the industry to- 
day there is no indication, however, that the 
design engineer approaches his task in terms 
of vision, audition, body motion, or other hu- 
man characteristics. More likely the design 
engineer is confronted with autopilot systems, 
electronic systems, and fuel systems, and 
simply considers the human operator as part 
of another system. 

Eventually the design engineer may start 
with the operator as the focal point; but at 
the moment, if we, as psychologists, expect 
the design engineers to incorporate our knowl- 
edge in their designs, we must adapt our 
information and recommendations to their 
design methods. This would suggest that fu- 
ture publications should be submitted to po- 
tential users—design engineers in this case— 
in addition to human engineers. 

The Human Engineering Guide for Equip- 
ment Designers represents a major step in the 
effort of human engineers to make their in- 
formation meaningful to equipment design- 
ers. It certainly should be part of the design 
engineer’s library. While it does not meet all 
of the designer’s needs, it nevertheless rep- 
resents the best the human engineering pro- 
fession has to offer at this time. 

J. W. Wissel 

Lockheed Aircraft Corporation 